java.lang.Object
  +--fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Semaphore
    +--fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Mutex
      +--fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLProducer
        +--fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLPipe
          +--fr.gouv.culture.oai.SynchronizedOAIObjectImpl
            +--fr.gouv.culture.oai.AbstractOAIHarvester
              +--fr.gouv.culture.sdx.oai.AbstractDocumentBaseOAIHarvester
Created by IntelliJ IDEA. User: rpandey Date: May 12, 2003 Time: 12:32:16 PM
Nested Class Summary |
Nested classes inherited from class fr.gouv.culture.oai.OAIObject |
fr.gouv.culture.oai.OAIObject.Node |
Field Summary | |
protected static java.lang.String |
ATTRIBUTE_NAME_ADMIN_EMAIL
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_FROM
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_KEEP_DELETED_RECORD
|
protected static java.lang.String |
ATTRIBUTE_NAME_METADATA_PREFIX
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_NAME
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_SDX_REPOSITORY
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_SET
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_UNTIL
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_UPDATE
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_URL
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NAME_USER_AGENT
Configuration node name |
protected static java.lang.String |
ATTRIBUTE_NO_RECORDS_PER_BATCH
Configuration node name |
protected Database |
database
Underlying database to store any info |
protected java.util.ArrayList |
deletedDocs
|
protected DocumentBase |
docbase
The underlying document base |
protected java.lang.String |
docbaseId
Id of the underlying document base |
protected java.util.Hashtable |
docBaseProps
Properties from the document base |
protected static java.lang.String |
ELEMENT_NAME_OAI_DATA_PROVIDERS
Configuration node name |
protected static java.lang.String |
ELEMENT_NAME_OAI_IDENTIFIER
Configuration node name |
protected static java.lang.String |
ELEMENT_NAME_OAI_VERB
Configuration node name |
protected java.lang.String |
ELEMENT_NAME_PIPELINE
|
protected java.io.FileOutputStream |
fileOs
|
protected java.io.File |
harvestDoc
|
protected java.util.ArrayList |
harvestedDocs
|
protected IDGenerator |
harvesterIdGen
IDGenerator for this object |
protected boolean |
keepDeletedRecords
|
protected static java.lang.String |
NO_DOCS_DELETED
|
protected static java.lang.String |
NO_DOCS_HARVESTED
|
protected int |
noRecordsPerBatch
|
protected static java.lang.String |
OAI_FAILED_HARVEST
|
protected static java.lang.String |
OAI_FROM
|
protected static java.lang.String |
OAI_HARVEST_ID
|
protected static java.lang.String |
OAI_HARVESTER_LAST_UPDATED
|
protected static java.lang.String |
OAI_HARVESTER_RESUMPTION_TOKEN
|
protected static java.lang.String |
OAI_IDENTIFIER
|
protected static java.lang.String |
OAI_METADATA_PREFIX
|
protected static java.lang.String |
OAI_SET
|
protected static java.lang.String |
OAI_UNTIL
|
protected static java.lang.String |
OAI_VERB
|
protected Pipeline |
pipe
Pre-indexation pipeline |
protected fr.gouv.culture.util.apache.avalon.cornerstone.services.scheduler.TimeScheduler |
scheduler
Time scheduler for stored requests |
protected java.util.Hashtable |
storedRequests
Requests in application.xconf |
protected java.util.Hashtable |
storeRepositoriesRefs
References to the underlying documentbase's/application's repositories |
protected java.io.File |
tempDir
|
protected java.lang.String |
TEMPFILE_SUFFIX
|
protected XMLDocument |
urlResource
|
Fields inherited from class fr.gouv.culture.oai.AbstractOAIHarvester |
adminEmails, captureElemContent, captureRecord, currentDatestamp, currentMetadtaUrlIdentifier, currentOaiIdentifier, firstXmlConsumer, identifierName, manager, OAI_REPOSITORY_URL, OAI_REQUEST_URL, repoUrl, requestParams, requestUrl, responseDate, resumptionToken, sBuff, userAgent |
Fields inherited from class fr.gouv.culture.oai.SynchronizedOAIObjectImpl |
logger |
Fields inherited from class fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLPipe |
synchronizedXmlConsumerAcquired |
Fields inherited from class fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLProducer |
synchronizedXmlConsumer |
Fields inherited from interface fr.gouv.culture.oai.OAIObject |
HTTP_HEADER_NAME_FROM, HTTP_HEADER_NAME_USER_AGENT, NUMBER_RECORDS_PER_RESPONSE |
Constructor Summary | |
AbstractDocumentBaseOAIHarvester(DocumentBase base)
Basic constructor |
Method Summary | |
protected void |
captureRecord()
Ends the capture of an oai record and renames the file written in cocoon's work directory to correspond to its id. TODO: is this a bad idea? Ids could be long. |
protected void |
captureResourceFromUrlIdentifier()
Captures the xml from a url taken from an oai record and adds it to the oai-record as a sibling of the |
void |
configure(org.apache.avalon.framework.configuration.Configuration configuration)
Configures this object |
protected void |
configureAdminEmails(org.apache.avalon.framework.configuration.Configuration configuration)
Configures a list of admin emails; they can be given as sub-elements, a single attribute, or both |
protected void |
configureDatabase(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the internal database |
protected void |
configureDataProviders(org.apache.avalon.framework.configuration.Configuration configuration)
Configures data providers info that can be reused and from which requests can be automatically executed |
protected void |
configureHarvestIDGenerator(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the id generator for harvests |
protected void |
configurePipeline(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the preIndexation pipeline |
protected void |
configureStoreRepositories(java.lang.String repoUrl,
org.apache.avalon.framework.configuration.Configuration oaiRepoConf)
Configures the repositories to which data will be stored based upon their repository url |
protected void |
configureUpdateTriggers(java.lang.String requestUrl,
org.apache.avalon.framework.configuration.Configuration updateConf)
Configures time triggers for stored requests |
protected void |
deleteTempDir()
Deletes the directory represented by the tempDir class field |
void |
endElement(java.lang.String s,
java.lang.String s1,
java.lang.String s2)
Receive notification of the end of an element. |
protected java.lang.String |
generateNewHarvestId()
Generates an id to associate with a harvest |
protected java.lang.String |
getHarvesterId()
Returns an id for this harvester based upon the underlying document base id |
protected IndexParameters |
getIndexParameters()
Builds simple index parameters for indexation of oai records into the underlying |
protected java.lang.String |
getIsoDate()
Gets the current date in ISO 8601 format |
protected java.io.File |
getNewTempDir()
Creates a new temporary directory for writing harvested records before they are indexed |
protected void |
handleResumptionToken()
Handles the resumption token by issuing another request based upon the request from which the resumption token was received. |
protected void |
initTempDir()
Establishes the tempDir class field |
java.util.Date |
lastUpdated()
Retrieves the time when the harvester was last updated |
protected void |
prepareRecordCapture()
Sets up resources to capture an oai record |
protected void |
prepareRecordForDeletion()
Sets up resources to delete an oai record |
protected void |
prepareResourceFromUrlIdentifierCapture()
Prepares to read a url value from an oai record and retrieve the XML behind it. |
void |
purgePastHarvestsData()
Destroys all summary data pertaining to past harvests but not the actual oai records harvested |
protected void |
resetAllFields()
Resets necessary class fields |
protected void |
resetRecordCaptureFields(boolean deleteDoc)
Resets the class fields for record capture, possibly deleting the file underlying the current harvestDoc object |
protected void |
saveCriticalFields(boolean dataHarvested)
Saves critical data about a harvest |
void |
sendPastHarvestsSummary()
Sends sax events to the current consumer with summary details of all the past harvests |
void |
sendStoredHarvestingRequests()
Sends the details of stored harvesting requests to the current consumer |
void |
setProperties(java.util.Hashtable props)
Sets the properties for this object |
protected boolean |
shouldHarvestDocument()
Queries the underlying data structures, based upon the current sax flow position and the class fields already set, and determines whether an oai record should be harvested |
void |
startElement(java.lang.String s,
java.lang.String s1,
java.lang.String s2,
org.xml.sax.Attributes attributes)
Receive notification of the beginning of an element. |
protected void |
storeFailedHarvestData(java.lang.Exception e)
Stores data about harvesting failures caused by problems other than oai errors sent from a queried repository |
protected boolean |
storeHarvestedData()
Reads the documents from tempDir and indexes them in the corresponding document base; any marked deletions will be carried out as well |
void |
targetTriggered(java.lang.String triggerName)
Triggers an oai request to a repository based upon a trigger name (also a request url) |
Methods inherited from class fr.gouv.culture.oai.AbstractOAIHarvester |
abortRecordCapture, characters, compose, getAdminEmails, getHarvestParameters, handleErrors, receiveRequest, receiveSynchronizedRequest, recycle, resetResumptionToken, setAdminEmails, setConsumer, setIdentifierName, toSAX |
Methods inherited from class fr.gouv.culture.oai.SynchronizedOAIObjectImpl |
enableLogging, sendElement, sendElementContent |
Methods inherited from class fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLPipe |
acquireSynchronizedXMLConsumer, comment, endCDATA, endDocument, endDTD, endEntity, endPrefixMapping, ignorableWhitespace, processingInstruction, releaseSynchronizedXMLConsumer, setDocumentLocator, skippedEntity, startCDATA, startDocument, startDTD, startEntity, startPrefixMapping |
Methods inherited from class fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLProducer |
setConsumer, setSynchronizedConsumer |
Methods inherited from class fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Mutex |
isAcquired |
Methods inherited from class fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Semaphore |
acquire, attempt, getTokens, release |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled |
enableLogging |
Methods inherited from interface org.xml.sax.ContentHandler |
endDocument, endPrefixMapping, ignorableWhitespace, processingInstruction, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping |
Methods inherited from interface org.xml.sax.ext.LexicalHandler |
comment, endCDATA, endDTD, endEntity, startCDATA, startDTD, startEntity |
Methods inherited from interface fr.gouv.culture.util.apache.cocoon.xml.SynchronizedXMLProducer |
setConsumer |
Methods inherited from interface org.apache.avalon.excalibur.concurrent.Sync |
acquire, attempt, release |
Field Detail |
protected java.lang.String ELEMENT_NAME_PIPELINE
protected DocumentBase docbase
protected java.lang.String docbaseId
protected java.util.Hashtable docBaseProps
protected Pipeline pipe
protected Database database
protected java.util.Hashtable storedRequests
protected java.util.Hashtable storeRepositoriesRefs
protected fr.gouv.culture.util.apache.avalon.cornerstone.services.scheduler.TimeScheduler scheduler
protected IDGenerator harvesterIdGen
protected java.lang.String TEMPFILE_SUFFIX
protected java.io.File tempDir
protected java.io.File harvestDoc
protected java.io.FileOutputStream fileOs
protected XMLDocument urlResource
protected java.util.ArrayList deletedDocs
protected java.util.ArrayList harvestedDocs
protected boolean keepDeletedRecords
protected int noRecordsPerBatch
protected static final java.lang.String ELEMENT_NAME_OAI_DATA_PROVIDERS
protected static final java.lang.String ELEMENT_NAME_OAI_VERB
protected static final java.lang.String ELEMENT_NAME_OAI_IDENTIFIER
protected static final java.lang.String ATTRIBUTE_NAME_NAME
protected static final java.lang.String ATTRIBUTE_NAME_ADMIN_EMAIL
protected static final java.lang.String ATTRIBUTE_NAME_USER_AGENT
protected static final java.lang.String ATTRIBUTE_NAME_URL
protected static final java.lang.String ATTRIBUTE_NAME_UPDATE
protected static final java.lang.String ATTRIBUTE_NAME_METADATA_PREFIX
protected static final java.lang.String ATTRIBUTE_NAME_SDX_REPOSITORY
protected static final java.lang.String ATTRIBUTE_NAME_FROM
protected static final java.lang.String ATTRIBUTE_NAME_UNTIL
protected static final java.lang.String ATTRIBUTE_NAME_SET
protected static final java.lang.String ATTRIBUTE_NAME_KEEP_DELETED_RECORD
protected static final java.lang.String ATTRIBUTE_NO_RECORDS_PER_BATCH
protected static final java.lang.String OAI_HARVEST_ID
protected static final java.lang.String OAI_FAILED_HARVEST
protected static final java.lang.String OAI_HARVESTER_LAST_UPDATED
protected static final java.lang.String OAI_HARVESTER_RESUMPTION_TOKEN
protected static final java.lang.String OAI_VERB
protected static final java.lang.String OAI_IDENTIFIER
protected static final java.lang.String OAI_METADATA_PREFIX
protected static final java.lang.String OAI_FROM
protected static final java.lang.String OAI_UNTIL
protected static final java.lang.String OAI_SET
protected static final java.lang.String NO_DOCS_DELETED
protected static final java.lang.String NO_DOCS_HARVESTED
Constructor Detail |
public AbstractDocumentBaseOAIHarvester(DocumentBase base)
Method Detail |
public void setProperties(java.util.Hashtable props)
public void configure(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configure
in interface org.apache.avalon.framework.configuration.Configurable
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureDatabase(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureHarvestIDGenerator(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
org.apache.avalon.framework.configuration.ConfigurationException
protected java.lang.String getHarvesterId()
protected void configureAdminEmails(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configuration -
org.apache.avalon.framework.configuration.ConfigurationException
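The "sub-elements, a single attribute, or both" behaviour described for configureAdminEmails can be illustrated with a small sketch. It is not the SDX implementation: the real method reads an Avalon Configuration, while this sketch uses the JDK's DOM parser so that it runs on its own. The class name AdminEmailsSketch and the attribute and element name "adminEmail" are assumptions made for illustration only; the actual value of ATTRIBUTE_NAME_ADMIN_EMAIL is not shown on this page.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the SDX API.
class AdminEmailsSketch {

    // Collects emails from an assumed "adminEmail" attribute on the root
    // element and from assumed <adminEmail> descendant elements.
    static Set<String> read(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            // LinkedHashSet keeps insertion order and drops duplicates.
            Set<String> emails = new LinkedHashSet<>();
            String attribute = root.getAttribute("adminEmail").trim();
            if (!attribute.isEmpty()) {
                emails.add(attribute);
            }
            NodeList children = root.getElementsByTagName("adminEmail");
            for (int i = 0; i < children.getLength(); i++) {
                String text = children.item(i).getTextContent().trim();
                if (!text.isEmpty()) {
                    emails.add(text);
                }
            }
            return emails;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read admin emails", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read(
                "<harvester adminEmail=\"a@example.org\">"
                + "<adminEmail>b@example.org</adminEmail></harvester>"));
    }
}
```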
protected void configureDataProviders(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configuration -
org.apache.avalon.framework.configuration.ConfigurationException
storedRequests
protected void configureUpdateTriggers(java.lang.String requestUrl, org.apache.avalon.framework.configuration.Configuration updateConf) throws org.apache.avalon.framework.configuration.ConfigurationException
requestUrl - The request url
updateConf - The configuration for updates
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureStoreRepositories(java.lang.String repoUrl, org.apache.avalon.framework.configuration.Configuration oaiRepoConf) throws org.apache.avalon.framework.configuration.ConfigurationException
repoUrl - The repository/data provider url
oaiRepoConf - The configuration
org.apache.avalon.framework.configuration.ConfigurationException
protected void configurePipeline(org.apache.avalon.framework.configuration.Configuration configuration) throws org.apache.avalon.framework.configuration.ConfigurationException
configuration -
org.apache.avalon.framework.configuration.ConfigurationException
pipe
protected java.io.File getNewTempDir() throws SDXException, java.io.IOException
SDXException
java.io.IOException
protected void deleteTempDir()
protected void initTempDir() throws SDXException, java.io.IOException
SDXException
java.io.IOException
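The temporary directory life cycle covered by getNewTempDir, deleteTempDir and initTempDir could look roughly like the following sketch. It assumes nothing about where SDX actually places the directory (the summary only mentions cocoon's work directory); the class name TempDirSketch, the prefix argument and the use of the JVM's default temporary location are all illustrative assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of the SDX API.
class TempDirSketch {

    // Creates a fresh, uniquely named directory under the default temp location.
    static File newTempDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix).toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes the directory and everything below it; does nothing if it is absent.
    static void deleteRecursively(File dir) {
        if (dir == null || !dir.exists()) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir.toPath())) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder())
                 .forEach(path -> path.toFile().delete());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dir = newTempDir("harvest");
        System.out.println("created: " + dir.isDirectory());
        deleteRecursively(dir);
        System.out.println("still exists: " + dir.exists());
    }
}
```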
protected java.lang.String getIsoDate()
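As an illustration of what an ISO 8601 date string looks like in an OAI context, here is a minimal sketch. The page does not say which ISO 8601 variant getIsoDate returns; the sketch assumes the UTC, seconds-granularity form (YYYY-MM-DDThh:mm:ssZ) that OAI-PMH uses for datestamps. The class name IsoDateSketch is hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the SDX API.
class IsoDateSketch {

    // Assumed format: UTC with seconds granularity, as used by OAI-PMH datestamps.
    private static final DateTimeFormatter ISO_SECONDS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'")
                             .withZone(ZoneOffset.UTC);

    static String format(Instant instant) {
        return ISO_SECONDS.format(instant);
    }

    static String now() {
        return format(Instant.now());
    }

    public static void main(String[] args) {
        System.out.println(now());
    }
}
```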
protected void prepareRecordCapture() throws org.xml.sax.SAXException
prepareRecordCapture
in class fr.gouv.culture.oai.AbstractOAIHarvester
org.xml.sax.SAXException
protected void captureRecord() throws java.lang.Exception
captureRecord
in class fr.gouv.culture.oai.AbstractOAIHarvester
java.lang.Exception
protected void resetRecordCaptureFields(boolean deleteDoc)
Resets the class fields for record capture, possibly deleting the file underlying the current harvestDoc object
resetRecordCaptureFields
in class fr.gouv.culture.oai.AbstractOAIHarvester
deleteDoc - flag for deletion of actual file
protected void prepareRecordForDeletion()
prepareRecordForDeletion
in class fr.gouv.culture.oai.AbstractOAIHarvester
protected boolean storeHarvestedData() throws SDXException, org.xml.sax.SAXException, org.apache.cocoon.ProcessingException
Reads the documents from tempDir and indexes them in the corresponding document base; any marked deletions will be carried out as well
storeHarvestedData
in class fr.gouv.culture.oai.AbstractOAIHarvester
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
protected void handleResumptionToken()
handleResumptionToken
in class fr.gouv.culture.oai.AbstractOAIHarvester
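Issuing a follow-up request from a resumption token can be sketched as below. This is not the SDX code: it only shows the OAI-PMH convention that a follow-up request carries the verb plus the resumptionToken as its sole other argument, and that an empty token marks the end of the list. The class name ResumptionSketch, the method name and the example URL are assumptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the SDX API.
class ResumptionSketch {

    // Returns the URL of the next request, or null when the list is complete.
    static String nextRequestUrl(String baseUrl, String verb, String resumptionToken) {
        if (resumptionToken == null || resumptionToken.trim().isEmpty()) {
            return null; // an empty token means there is nothing left to fetch
        }
        try {
            // resumptionToken is an exclusive argument: only the verb accompanies it.
            return baseUrl
                    + "?verb=" + URLEncoder.encode(verb, "UTF-8")
                    + "&resumptionToken=" + URLEncoder.encode(resumptionToken.trim(), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nextRequestUrl("http://example.org/oai", "ListRecords", "abc/123"));
    }
}
```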
protected void prepareResourceFromUrlIdentifierCapture()
prepareResourceFromUrlIdentifierCapture
in class fr.gouv.culture.oai.AbstractOAIHarvester
protected void captureResourceFromUrlIdentifier()
captureResourceFromUrlIdentifier
in class fr.gouv.culture.oai.AbstractOAIHarvester
protected void resetAllFields()
resetAllFields
in class fr.gouv.culture.oai.AbstractOAIHarvester
protected IndexParameters getIndexParameters()
public void sendStoredHarvestingRequests() throws org.xml.sax.SAXException
sendStoredHarvestingRequests
in interface fr.gouv.culture.oai.OAIHarvester
org.xml.sax.SAXException
public void targetTriggered(java.lang.String triggerName)
targetTriggered
in interface fr.gouv.culture.util.apache.avalon.cornerstone.services.scheduler.Target
triggerName -
public void startElement(java.lang.String s, java.lang.String s1, java.lang.String s2, org.xml.sax.Attributes attributes) throws org.xml.sax.SAXException
Description copied from class: fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLPipe
Receive notification of the beginning of an element.
startElement
in interface org.xml.sax.ContentHandler
startElement
in class fr.gouv.culture.oai.AbstractOAIHarvester
s - The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
s1 - The local name (without prefix), or the empty string if Namespace processing is not being performed.
s2 - The raw XML 1.0 name (with prefix), or the empty string if raw names are not available.
attributes - The attributes attached to the element. If there are no attributes, it shall be an empty Attributes object.
org.xml.sax.SAXException
org.xml.sax.SAXException
public void endElement(java.lang.String s, java.lang.String s1, java.lang.String s2) throws org.xml.sax.SAXException
Description copied from class: fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLPipe
Receive notification of the end of an element.
endElement
in interface org.xml.sax.ContentHandler
endElement
in class fr.gouv.culture.oai.AbstractOAIHarvester
s - The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
s1 - The local name (without prefix), or the empty string if Namespace processing is not being performed.
s2 - The raw XML 1.0 name (with prefix), or the empty string if raw names are not available.
org.xml.sax.SAXException
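The startElement and endElement callbacks are where a SAX-driven harvester decides when a record begins and ends. The following sketch shows the general pattern with the JDK's own SAX parser: it collects the text of every identifier element seen in the stream. It is only an illustration of the callback mechanism, not the harvester's logic; the class name RecordCaptureSketch and the sample XML (which omits the OAI-PMH namespaces for brevity) are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of the SDX API.
class RecordCaptureSketch extends DefaultHandler {

    final List<String> identifiers = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inIdentifier;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("identifier".equals(localName)) {
            inIdentifier = true;   // start buffering character data
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inIdentifier) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("identifier".equals(localName)) {
            identifiers.add(text.toString().trim());
            inIdentifier = false;
        }
    }

    // Parses the given XML and returns the identifiers in document order.
    static List<String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            RecordCaptureSketch handler = new RecordCaptureSketch();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.identifiers;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(
                "<ListRecords><record><header>"
                + "<identifier>oai:example.org:1</identifier>"
                + "</header></record></ListRecords>"));
    }
}
```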
protected boolean shouldHarvestDocument()
shouldHarvestDocument
in class fr.gouv.culture.oai.AbstractOAIHarvester
protected void saveCriticalFields(boolean dataHarvested) throws org.xml.sax.SAXException
saveCriticalFields
in class fr.gouv.culture.oai.AbstractOAIHarvester
dataHarvested -
org.xml.sax.SAXException
protected java.lang.String generateNewHarvestId()
public void sendPastHarvestsSummary() throws org.xml.sax.SAXException
sendPastHarvestsSummary
in interface fr.gouv.culture.oai.OAIHarvester
org.xml.sax.SAXException
public java.util.Date lastUpdated()
public void purgePastHarvestsData()
purgePastHarvestsData
in interface fr.gouv.culture.oai.OAIHarvester
protected void storeFailedHarvestData(java.lang.Exception e)
storeFailedHarvestData
in class fr.gouv.culture.oai.AbstractOAIHarvester
e -