OpenSearch Notes


    For Deki Wiki 1.10, search results will be returned in OpenSearch format.  The OpenSearch development is being done in a separate branch (/public/dekiwiki/search) and will be merged into trunk when it's completed.

    Implementation Notes

    • added:  GET:site/search?format=opensearch to return an OpenSearch-compliant Atom feed.
      • The PHP search engine (/extensions/LuceneSearch.php) uses the Atom feed to display results.  The link dialog will continue to use format=search until we decide to update that code.
    • added:  GET:site/search/description to return the OpenSearch description document (used to discover the URI format of the search engine)
    • added a preview field to the Lucene index
      • this field holds the first 1024 characters of a page (without markup), or the first 1024 characters of an attachment if a search filter is available to convert the attachment to plain text
    • OpenSearch has a <totalResults> element.  Since we apply permissions to search results after the query runs (and those permissions aren't stored in the Lucene index), the number of results visible to the user isn't known up front, so I had to hard-code the totalResults value to an arbitrarily high number (100,000)
      • A possible workaround for this is to actually store page permissions in the Lucene index.  However, this would require us to re-index a document/attachment each time page permissions change.
    • Paging:  The paging is done via offset/limit.  If no atom:entry elements appear in the feed it's assumed we've reached the end of the results.  We might also want to consider implementing
    • Output format: OpenSearch results can be returned in RSS 2.0, Atom 1.0, and HTML format.  I've only implemented an Atom feed at this time.
    • Extension elements: The Atom feed returned by Deki Wiki's OpenSearch adds some additional information to search results, which the client (PHP) uses to provide additional context.
      • The following namespace is used to denote Deki Wiki extension elements:  xmlns:dekilucene="http://services.mindtouch.com/deki/d...06/luceneindex"
      • <dekilucene:path>Deki_Wiki/Installation_and_Upgrade/Installation_FAQ/Deki_Wiki_Dependency_Lists</dekilucene:path>
        • For pages, this is the path within the hierarchy
        • For attachments, this is the path to the page where it's attached
      • <dekilucene:size>4443</dekilucene:size> - the number of characters in the page or the size (in bytes) of an attachment
      • <dekilucene:wordcount>367</dekilucene:wordcount> - The number of words in a page.  This value is always 0 for attachments
      • <dekilucene:id.file>1831</dekilucene:id.file> - the id of the file (this element doesn't exist for page entries)
      • <dekilucene:page.parent id="4037" path="Deki_Wiki/Installation_and_Upgrade/Installation_FAQ" title="Installation FAQ" href="http://dekidev/@api/deki/pages/4037"/> - This is provided so the client (PHP) can display a link to the parent page without having to hit the API again
        • For files, this is the page it's attached to
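
    The paging rule and the extension elements described above can be sketched from the client's side.  This is only an illustration in Python, not the PHP code in LuceneSearch.php: the helper names (search_url, parse_entries, iter_results), the "q" query parameter, and the fetch_page callback are all assumptions.  The dekilucene namespace URI is passed in by the caller because it is abbreviated in these notes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "http://www.w3.org/2005/Atom"


def search_url(api_base, query, offset, limit):
    """Build a GET:site/search URL that requests the OpenSearch Atom feed."""
    # "q" as the query parameter name is an assumption; format=opensearch
    # and offset/limit paging come from the notes above.
    params = {"q": query, "format": "opensearch", "offset": offset, "limit": limit}
    return f"{api_base}/site/search?{urlencode(params)}"


def parse_entries(feed_xml, ext_ns):
    """Return one dict per atom:entry, including the dekilucene elements."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.findall(f"{{{ATOM_NS}}}entry"):
        def ext(name):
            # Text of a dekilucene:<name> child, or None when it is absent.
            return entry.findtext(f"{{{ext_ns}}}{name}")

        parent = entry.find(f"{{{ext_ns}}}page.parent")
        file_id = ext("id.file")
        results.append({
            "title": entry.findtext(f"{{{ATOM_NS}}}title"),
            "path": ext("path"),
            "size": int(ext("size") or 0),
            "wordcount": int(ext("wordcount") or 0),  # 0 for attachments
            # id.file only exists for attachments, so pages get None.
            "file_id": int(file_id) if file_id is not None else None,
            "parent_href": parent.get("href") if parent is not None else None,
        })
    return results


def iter_results(fetch_page, ext_ns, limit=10):
    """Yield results page by page until a feed has no atom:entry elements."""
    offset = 0
    while True:
        # fetch_page(offset, limit) is a hypothetical callback that returns
        # the Atom feed XML for one page of results.
        entries = parse_entries(fetch_page(offset, limit), ext_ns)
        if not entries:
            return  # empty feed: assume the end of the results
        yield from entries
        offset += limit
```

    The loop stops on an empty feed rather than comparing against totalResults, since that value is hard-coded.  A real fetch_page would issue an HTTP GET against search_url(...) and return the response body.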

     



    Powered by MindTouch 2010