Search
Searching in Frappe is managed by the Search module. It is a wrapper for Whoosh, a full-text search library written in Python.
You can extend the FullTextSearch class to create a search class for a specific requirement. For example, WebsiteSearch is a wrapper for indexing public-facing web pages and exposing a search over them.
The FullTextSearch class
Each FullTextSearch (FTS) instance holds a Schema defined by the class itself. That means each FTS implementation has its own schema. You can create a new implementation if you wish to index with a different schema. In addition, the FTS class has other controller methods to facilitate creating, updating, and querying the index.
Extending the FTS class
When initializing an FTS-based class, you need to provide an index name. On instantiation, the following attributes are initialized:
- index\_name: name of the index provided
- index\_path: path of the index in the sites folder
- schema: the schema returned by the get\_schema function
- id: id used to identify a document in the index
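The initialization described above follows a template-method pattern: the constructor stores the index name, then asks overridable hooks for the schema and the id field. The sketch below mimics that pattern in plain Python so that it runs without Frappe or Whoosh. MiniFullTextSearch, MiniBlogSearch, the sites\_dir argument, the "indexes" subfolder, and the dict used as a schema are all illustrative stand-ins, not Frappe's actual implementation.

```python
import os


class MiniFullTextSearch:
    """Toy stand-in for an FTS base class (not Frappe's real class)."""

    def __init__(self, index_name, sites_dir="sites"):
        self.index_name = index_name
        # Hypothetical layout: the real index location is decided by Frappe.
        self.index_path = os.path.join(sites_dir, "indexes", index_name)
        # The schema and id come from hooks that subclasses may override.
        self.schema = self.get_schema()
        self.id = self.get_id()

    def get_schema(self):
        # Stand-in for a Whoosh Schema: field name -> field type label.
        return {"name": "ID", "content": "TEXT"}

    def get_id(self):
        return "name"


class MiniBlogSearch(MiniFullTextSearch):
    def get_schema(self):
        # A subclass changes the schema by overriding the hook.
        return {"name": "ID", "title": "TEXT", "content": "TEXT"}


search = MiniBlogSearch("blog_posts")
print(search.index_name, search.id, sorted(search.schema))
```

Because the constructor only calls the hooks, a subclass never needs to override the constructor itself to index with a different schema.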
Once instantiated, you can run the build function. It gets all the documents from get\_items\_to\_index. The documents are a list of frappe.\_dict (frappe dicts) conforming to the defined schema. These documents are then added to the index and written to the file.
You can search the index using the search method of the FTS class. These functions are documented in the API reference.
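The build-then-search flow can be sketched with a toy in-memory index. This is a simplified model under stated assumptions, not how Whoosh or Frappe work internally: MiniIndex, MiniBlogIndex, the tokenize helper, and the sample documents are hypothetical, the index lives in a dict instead of a file, and a query matches only documents that contain every query token.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class MiniIndex:
    """Toy in-memory index; the real one is a Whoosh index on disk."""

    def __init__(self, fields, id_field="name"):
        self.fields = fields
        self.id_field = id_field
        self.postings = defaultdict(set)  # token -> ids of matching documents

    def get_items_to_index(self):
        # Subclasses return a list of dicts that match self.fields.
        return []

    def build(self):
        """Collect documents, check them against the schema, then index them."""
        self.postings.clear()
        for doc in self.get_items_to_index():
            missing = [f for f in self.fields if f not in doc]
            if missing:
                raise ValueError(f"document lacks schema fields: {missing}")
            for field in self.fields:
                for token in tokenize(str(doc[field])):
                    self.postings[token].add(doc[self.id_field])

    def search(self, text):
        """Return sorted ids of documents containing every query token."""
        tokens = tokenize(text)
        if not tokens:
            return []
        matches = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(matches)


class MiniBlogIndex(MiniIndex):
    def get_items_to_index(self):
        return [
            {"name": "intro-to-frappe", "content": "Getting started with Frappe"},
            {"name": "search-tips", "content": "Full text search with Whoosh"},
        ]


index = MiniBlogIndex(fields=["name", "content"])
index.build()
print(index.search("text search"))  # ['search-tips']
```

The schema check in build mirrors the requirement that every document conforms to the defined schema before it is added to the index.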
An example implementation for blog posts looks like the following:
```python
import frappe
from frappe.search.full_text_search import FullTextSearch


class BlogWrapper(FullTextSearch):
    # Default schema and id, shown for reference:
    # def get_schema(self):
    #     return Schema(name=ID(stored=True), content=TEXT(stored=True))

    # def get_id(self):
    #     return "name"

    def get_items_to_index(self):
        # get_all_blogs() is a placeholder for a helper that returns blog post names
        docs = []
        for blog_name in get_all_blogs():
            docs.append(self.get_document_to_index(blog_name))
        return docs

    def get_document_to_index(self, name):
        blog = frappe.get_doc("Blog Post", name)
        return frappe._dict(name=name, content=blog.content)

    def parse_result(self, result):
        # Return only the document name from each search hit
        return result["name"]
```
- get\_items\_to\_index: Get all routes to be indexed. This includes the static pages in www/ and routes from published documents.
- get\_document\_to\_index: Render a page and parse it using BeautifulSoup.
- parse\_result: All search results are parsed using this function.
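To illustrate the "render a page and parse it" step, the sketch below extracts indexable text from an HTML string. It uses the standard library's html.parser as a stand-in for BeautifulSoup so that it runs with no dependencies. The TextExtractor class, the page\_to\_document helper, the sample markup, and the choice to skip script and style content are assumptions made for illustration, not Frappe's implementation.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def page_to_document(route, html):
    """Build a dict with the default schema's fields: name and content."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return {"name": route, "content": " ".join(extractor.parts)}


html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>About</h1><p>We build <b>Frappe</b> apps.</p></body></html>"
)
doc = page_to_document("about", html)
print(doc["content"])  # About We build Frappe apps.
```

The returned dict has the same two fields as the default schema (name and content), so a document built this way could be handed to an index that uses that schema.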