Search
Searching in Frappe is managed by the Search module. It is a wrapper for Whoosh a full text search library written in Python.
You can extend the FullTextSearch
class to create a search class for a specific requirement. For example the WebsiteSearch
is a wrapper for indexing public facing web pages and exposing a search.
The FullTextSearch
class
Each FullTextSearch (FTS) instance holds a Schema defined by the class itself. That means, a specific FTS implementation will have it's specific schema. You can create a new implementation if you wish to index with a different schema. Along with this the FTS
class has other controllers to facilitate creating, updating and querying the index.
Extending the FTS class
When initializing a FTS based class, you need to provide an index name. On instantiation, the following params are initialized
- index_name
: name of the index provided.
- index_path
: path of the index in the sites folder
- schema
: return by the get_schema
function
- id
: id used to recognize the document in the index
Once instantiated you can run the build
function. It gets all the documents from get_items_to_index
, the documents are a list of frappe._dict
(frappe dicts) conforming to the defined schema. These documents are then added to the index and written to the file.
You can search the index using the search
method of the FTS class. These functions are documented in the API reference here.
An example implementation for blog will look like the following:
class BlogWrapper(FullTextSearch):
# Default Schema
# def get_schema(self):
# return Schema(name=ID(stored=True), content=TEXT(stored=True))
# def get_id(self):
# return "name"
def get_items_to_index(self):
docs = []
for blog_name in get_all_blogs():
docs.append(get_document_to_index(blog_name))
return docs
def get_document_to_index(self, name):
blog = frappe.get_doc("Blog Post", name)
return frappe._dict(name=name, content=blog.content)
def parse_result(self, result):
return result["name"]
get_items_to_index
: Get all routes to be indexed, this includes the static pages in www/ and routes from published documentsget_document_to_index
: Render a page and parse it using BeautifulSoupparse_result
: all the search results are parsed using this function