search_engine_parser.core package
Subpackages
- search_engine_parser.core.engines package
  - Submodules
    - search_engine_parser.core.engines.aol module
    - search_engine_parser.core.engines.ask module
    - search_engine_parser.core.engines.baidu module
    - search_engine_parser.core.engines.bing module
    - search_engine_parser.core.engines.coursera module
    - search_engine_parser.core.engines.duckduckgo module
    - search_engine_parser.core.engines.github module
    - search_engine_parser.core.engines.google module
    - search_engine_parser.core.engines.googlescholar module
    - search_engine_parser.core.engines.myanimelist module
    - search_engine_parser.core.engines.stackoverflow module
    - search_engine_parser.core.engines.yahoo module
    - search_engine_parser.core.engines.yandex module
    - search_engine_parser.core.engines.youtube module
  - Module contents
Submodules
search_engine_parser.core.base module
@desc Base class inherited by every search engine
class search_engine_parser.core.base.BaseSearch
Bases: object
- async_search(query=None, page=1, cache=True, **kwargs)
  Query the search engine in async mode.
  Parameters:
  - query (str) – the query to search for
  - page (int) – page to be displayed, defaults to 1
  Returns: a dictionary containing titles, links, netlocs and descriptions.
- cache_handler
- clear_cache(all_cache=False)
  Triggers the clear cache function for a particular engine.
  Parameters: all_cache – if True, clears the cache for all engines
- get_params(query=None, page=None, offset=None, **kwargs)
  This function should be overridden to return a dictionary of query params.
- get_source(url, cache=True)
  Returns the source code of a webpage. Also sets _cache_hit if the cache was used.
  Parameters: url – URL whose source code is to be pulled
  Returns: HTML source code of the given URL.
  Return type: string
- name = None
- parse_result(results, **kwargs)
  Runs every entry on the page through parse_single_result.
  Parameters: results (list[bs4.element.ResultSet]) – result of the main search, from which individual results are extracted
  Returns: a dictionary containing lists of titles, links, descriptions and other possible returns.
  Return type: dict
- parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)
  Every div/span containing a result is passed here to retrieve its title, link and description.
- search(query=None, page=1, cache=True, **kwargs)
  Query the search engine.
  Parameters:
  - query (str) – the query to search for
  - page (int) – page to be displayed, defaults to 1
  Returns: a dictionary containing titles, links, netlocs and descriptions.
- search_url = None
- summary = None
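The contract described for BaseSearch (an engine supplies get_params, and parse_result runs each entry through parse_single_result) can be sketched without the library. This is a stand-in for illustration only, not the package's implementation: ToyEngine, the "q" and "page" parameter names, the example.com URL and the plain-dict entries are all assumptions. Real engines would subclass BaseSearch and parse bs4 elements instead.

```python
from urllib.parse import urlencode


class ToyEngine:
    """Hypothetical stand-in that mimics the documented BaseSearch contract."""

    name = "Toy"
    search_url = "https://example.com/search?"  # assumed URL, not a real engine

    def get_params(self, query=None, page=None, offset=None, **kwargs):
        # Return a dictionary of query params (parameter names are assumed).
        return {"q": query, "page": page}

    def parse_single_result(self, single_result, **kwargs):
        # A real engine would extract these fields from a div/span; here the
        # entry is already a plain dict, so the fields are simply renamed.
        return {
            "titles": single_result["title"],
            "links": single_result["link"],
            "descriptions": single_result["desc"],
        }

    def parse_result(self, results, **kwargs):
        # Run every entry through parse_single_result and collect one list per key.
        collected = {}
        for entry in results:
            for key, value in self.parse_single_result(entry, **kwargs).items():
                collected.setdefault(key, []).append(value)
        return collected


engine = ToyEngine()
url = engine.search_url + urlencode(engine.get_params(query="python enum", page=2))
parsed = engine.parse_result([
    {"title": "A", "link": "https://a.example", "desc": "first"},
    {"title": "B", "link": "https://b.example", "desc": "second"},
])
```

Keeping get_params free of any URL handling means the same dictionary can be encoded, logged or cached independently of how the request is made.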
class search_engine_parser.core.base.ReturnType
Bases: enum.Enum
An enumeration.
- DESCRIPTION = 'descriptions'
- FULL = 'full'
- LINK = 'links'
- TITLE = 'titles'
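The member values listed above double as result keys. The sketch below re-declares the enumeration locally, with the documented values, rather than importing it. The wanted_keys helper is hypothetical and only illustrates how a return_type argument could select which keys to extract; it is not part of the package.

```python
from enum import Enum


class ReturnType(Enum):
    # Local re-declaration mirroring the documented member values.
    FULL = "full"
    TITLE = "titles"
    DESCRIPTION = "descriptions"
    LINK = "links"


def wanted_keys(return_type):
    """Hypothetical helper: map a ReturnType to the result keys it selects."""
    if return_type is ReturnType.FULL:
        # FULL selects every concrete key, in declaration order.
        return [m.value for m in ReturnType if m is not ReturnType.FULL]
    return [return_type.value]


link_type = ReturnType("links")  # standard Enum lookup by value
```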
class search_engine_parser.core.base.SearchItem
Bases: dict
SearchItem is a dict of results containing keys (titles, descriptions, links and other additional keys depending on the engine)
>>> result
<search_engine_parser.core.base.SearchItem object at 0x7f907426a280>
>>> result["description"]
Some description
>>> result["descriptions"]
Same description
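The singular/plural lookup shown in the doctest above can be sketched with a small dict subclass. This is one way such aliasing could work, written under that assumption; SearchItemSketch is a hypothetical name and the library's own code may differ.

```python
class SearchItemSketch(dict):
    """Hypothetical dict that accepts 'description' as an alias of 'descriptions'."""

    def __getitem__(self, key):
        # Fall back to the plural form when the key as given is not stored.
        if key not in self and key + "s" in self:
            key = key + "s"
        return super().__getitem__(key)


item = SearchItemSketch(titles="A title", descriptions="Some description")
singular = item["description"]
plural = item["descriptions"]
```

A key that has neither a singular nor a plural match still raises KeyError, as with a plain dict.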
class search_engine_parser.core.base.SearchResult
Bases: object
The SearchResult object returned after a search.
>>> results = gsearch.search("preaching the choir", 1)
>>> results
<search_engine_parser.core.base.SearchResult object at 0x7f907426a280>
The object supports retrieving individual results by iteration or just by type:
>>> results[0] # Returns the first result <SearchItem>
>>> results["descriptions"] # Returns a list of all descriptions from all results
It can be iterated like a normal list to return individual SearchItem objects.
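The access patterns described for SearchResult (integer index, lookup by type, iteration) can be illustrated with a small container. This is a sketch under the assumption that results are stored in a list; SearchResultSketch and its append method are hypothetical names, and plain dicts stand in for SearchItem.

```python
class SearchResultSketch:
    """Hypothetical container mimicking the documented SearchResult access patterns."""

    def __init__(self):
        self.results = []

    def append(self, item):
        self.results.append(item)

    def __getitem__(self, value):
        if isinstance(value, int):
            # Integer index: return one individual result.
            return self.results[value]
        # String key: gather that field from every result that has it.
        return [item[value] for item in self.results if value in item]

    def __iter__(self):
        return iter(self.results)

    def __len__(self):
        return len(self.results)


results = SearchResultSketch()
results.append({"titles": "A", "descriptions": "first"})
results.append({"titles": "B", "descriptions": "second"})

first = results[0]
all_descriptions = results["descriptions"]
titles_by_iteration = [item["titles"] for item in results]
```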
search_engine_parser.core.cli module
@desc Making use of the parser through cli
search_engine_parser.core.exceptions module
@desc Exceptions