search_engine_parser.core.engines package

Submodules

search_engine_parser.core.engines.aol module

Parser for AOL search results.

class search_engine_parser.core.engines.aol.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches AOL for a query string.

name = 'AOL'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div class="algo-sr">
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the AOL search soup for results.

search_url = 'https://search.aol.com/aol/search?'

summary = '\t According to netmarketshare, the old time famous AOL is still in the top 10 search engines with a market share that is close to 0.06%. The AOL network includes many popular web sites like engadget.com, techchrunch.com and the huffingtonpost.com. \nOn June 23, 2015, AOL was acquired by Verizon Communications.'
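Every parse_single_result signature in this package defaults return_type to ReturnType.FULL, whose value is 'full'. The following is a minimal sketch of how such a flag might narrow the returned dict. Only FULL = 'full' appears in this reference; the other enum members, the dict keys and the select_fields helper are assumptions made for illustration and are not taken from the library.

```python
from enum import Enum


class ReturnType(Enum):
    FULL = "full"                 # the only member shown in this reference
    TITLE = "titles"              # assumed for illustration
    LINK = "links"                # assumed for illustration
    DESCRIPTION = "descriptions"  # assumed for illustration


def select_fields(parsed, return_type=ReturnType.FULL):
    """Hypothetical helper: keep only the keys the requested return type asks for."""
    if return_type is ReturnType.FULL:
        return dict(parsed)
    return {key: value for key, value in parsed.items() if key == return_type.value}


parsed = {
    "titles": "Example",
    "links": "https://example.com",
    "descriptions": "An example page",
}
print(select_fields(parsed, ReturnType.LINK))
```

With the default return type, the whole dict comes back unchanged.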

search_engine_parser.core.engines.ask module

Parser for Ask search results.

class search_engine_parser.core.engines.ask.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Ask for a query string.

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'Ask'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div class="PartialSearchResults-item">
Returns:	parsed title, link and description of the single result
Return type:	str, str, str

parse_soup(soup)

Parses the Ask search soup for results.

search_url = 'https://www.ask.com/web?'

summary = '\t Formerly known as Ask Jeeves, Ask.com receives approximately 0.42% of the search share. ASK is based on a question/answer format where most questions are answered by other users or are in the form of polls.\nIt also has the general search functionality but the results returned lack quality compared to Google or even Bing and Yahoo.'
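Since search_url ends in '?' and get_params returns a dictionary of query parameters, the two can plausibly be combined into a request URL. The sketch below shows one way to do that with the standard library. The parameter names 'q' and 'page' and the build_url helper are assumptions; this reference does not say which keys the Ask engine actually sends.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.ask.com/web?"  # search_url as listed in this reference


def get_params(query=None, page=None, offset=None, **kwargs):
    """Hypothetical override: the keys 'q' and 'page' are assumed names."""
    params = {"q": query, "page": page}
    # Drop unset values so they do not appear in the query string
    return {key: value for key, value in params.items() if value is not None}


def build_url(query, page=1):
    """Hypothetical helper: append the encoded parameters to the search URL."""
    return SEARCH_URL + urlencode(get_params(query=query, page=page))


print(build_url("python tutorials", page=2))
```

urlencode takes care of escaping, so spaces and other reserved characters in the query do not need manual handling.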

search_engine_parser.core.engines.baidu module

Parser for Baidu search results.

class search_engine_parser.core.engines.baidu.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Baidu for a query string.

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'Baidu'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.Tag) – single result found in a div with a numeric id
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the Baidu search soup for results.

search_url = 'https://www.baidu.com/s?'

summary = "\tBaidu, Inc. is a Chinese multinational technology company specializing in Internet-related services and products and artificial intelligence (AI), headquartered in Beijing's Haidian District.\n\tIt is one of the largest AI and internet companies in the world.\n\tBaidu offers various services, including a Chinese search engine, as well as a mapping service called Baidu Maps."

Override get_search_url

search_engine_parser.core.engines.bing module

Parser for Bing search results.

class search_engine_parser.core.engines.bing.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Bing for a query string.

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'Bing'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <li class="b_algo">
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the Bing search soup for results.

search_url = 'https://www.bing.com/search?'

summary = '\tBing is Microsoft’s attempt to challenge Google in search, but despite their efforts they still did not manage to convince users that their search engine can be an alternative to Google.\n\tTheir search engine market share is constantly below 10%, even though Bing is the default search engine on Windows PCs.'
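To make the "single result found in <li class="b_algo">" wording concrete, here is a minimal sketch that pulls a title, link and description out of such items. It deliberately uses the standard library's html.parser so that it runs without dependencies; the library itself works with BeautifulSoup objects, as the bs4 types in the parameter lists indicate. The inner markup (an <a> holding the title, a <p> holding the description), the ResultParser class and the dict keys are assumptions for illustration only.

```python
from html.parser import HTMLParser


class ResultParser(HTMLParser):
    """Hypothetical parser: collects fields from <li class="b_algo"> items."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._current = None  # dict being filled while inside a result item
        self._field = None    # text field the parser is currently inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and "b_algo" in (attrs.get("class") or "").split():
            self._current = {"titles": "", "links": "", "descriptions": ""}
        elif self._current is not None:
            if tag == "a" and not self._current["links"]:
                # The first link inside the item is taken as the result link
                self._current["links"] = attrs.get("href", "")
                self._field = "titles"
            elif tag == "p":
                self._field = "descriptions"

    def handle_endtag(self, tag):
        if tag in ("a", "p"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.results.append(self._current)
            self._current = None

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()


html = """
<ol>
  <li class="b_algo"><h2><a href="https://example.com">Example Domain</a></h2>
      <p>Illustrative description text.</p></li>
  <li class="b_ad"><a href="https://ads.example">Ad</a></li>
</ol>
"""
parser = ResultParser()
parser.feed(html)
print(parser.results)
```

Items whose class list does not include b_algo are skipped, which mirrors the idea of selecting only organic results before parsing each one.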

search_engine_parser.core.engines.coursera module

Parser for Coursera search results.

class search_engine_parser.core.engines.coursera.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Coursera for a query string.

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'Coursera'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div class="gs_r gs_or gs_scl">
Returns:	parsed title, link, description, file link and result type of the single result
Return type:	dict

parse_soup(soup)

Parses the Coursera search soup for results.

search_url = 'https://www.coursera.org/search?'

summary = '\tCoursera is an American online learning platform founded by Stanford professors Andrew Ng and Daphne Koller that offers massive open online courses, specializations, and degrees.'

search_engine_parser.core.engines.duckduckgo module

Parser for DuckDuckGo search results.

class search_engine_parser.core.engines.duckduckgo.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches DuckDuckGo for a query string.

base_url = 'https://www.duckduckgo.com'

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'DuckDuckGo'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div id="r1-{id}">
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the DuckDuckGo search soup for query results.

search_url = 'https://www.duckduckgo.com/html/?'

summary = '\tHas a number of advantages over the other search engines. \n\tIt has a clean interface, it does not track users, it is not fully loaded with ads and has a number of very nice features (only one page of results, you can search directly other web sites etc).\n\tAccording to DuckDuckGo traffic stats [December, 2018], they are currently serving more than 30 million searches per day.'

search_engine_parser.core.engines.github module

Parser for GitHub search results.

class search_engine_parser.core.engines.github.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches GitHub for a query string.

base_url = 'https://github.com'

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'GitHub'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in the container element
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the GitHub search soup for results.

search_url = 'https://github.com/search?'

summary = '\tGitHub is an American company that provides hosting for software development version control using Git. It is a subsidiary of Microsoft, which acquired the company in 2018 for $7.5 billion.\n\tIt offers all of the distributed version control and source code management (SCM) functionality of Git as well as adding its own features.\n\tAs of May 2019, GitHub reports having over 37 million users and more than 100 million repositories (including at least 28 million public repositories), making it the largest host of source code in the world.'
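Some engines in this package, GitHub among them, expose a base_url next to search_url. A likely use for it is turning site-relative links found in results into absolute URLs; that purpose is an assumption, since this reference only lists the attribute. The sketch below shows the idea with urllib.parse.urljoin; the absolute_link helper and the sample paths are hypothetical.

```python
from urllib.parse import urljoin

BASE_URL = "https://github.com"  # base_url as listed in this reference


def absolute_link(href, base_url=BASE_URL):
    """Hypothetical helper: resolve a result link against the engine's base URL."""
    return urljoin(base_url, href)


print(absolute_link("/psf/requests"))             # site-relative link
print(absolute_link("https://example.com/page"))  # already absolute, unchanged
```

urljoin leaves links that are already absolute untouched, so the helper can be applied to every link without checking its form first.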

search_engine_parser.core.engines.google module

Parser for Google search results.

class search_engine_parser.core.engines.google.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Google for a query string.

base_url = 'https://www.google.com/'

get_params(query=None, offset=None, page=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'Google'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div class="g">
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the Google search soup for results.

parse_url(url)

summary = '\tNo need for further introductions. The search engine giant holds the first place in search with a stunning difference of 65% from second in place Bing.\n\tAccording to the latest netmarketshare report (November 2018) 73% of searches were powered by Google and only 7.91% by Bing.\n\tGoogle is also dominating the mobile/tablet search engine market share with 81%!'
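This reference gives no description for parse_url. One common reason a Google parser needs such a helper is that result links are sometimes wrapped in a redirect of the form /url?q=<target>. The sketch below unwraps links of that shape; it is an assumption about what a helper like this might do, not the library's documented behaviour.

```python
from urllib.parse import parse_qs, urlparse


def parse_url(url):
    """Hypothetical: unwrap '/url?q=<target>' links, return other URLs unchanged."""
    parsed = urlparse(url)
    if parsed.path == "/url":
        target = parse_qs(parsed.query).get("q")
        if target:
            return target[0]
    return url


print(parse_url("/url?q=https://example.com/page&sa=U"))
print(parse_url("https://example.com/page"))
```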

search_engine_parser.core.engines.googlescholar module

Parser for Google Scholar search results.

class search_engine_parser.core.engines.googlescholar.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Google Scholar for a query string.

get_params(query=None, offset=None, page=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'GoogleScholar'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div class="gs_r gs_or gs_scl">
Returns:	parsed title, link, description, file link and result type of the single result
Return type:	dict

parse_soup(soup)

Parses the Google Scholar search soup for results.

search_url = 'https://scholar.google.gr/scholar?'

summary = '\tGoogle Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines.'

search_engine_parser.core.engines.myanimelist module

Parser for MyAnimeList search results.

class search_engine_parser.core.engines.myanimelist.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches MyAnimeList for a query string.

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'MyAnimeList'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.Tag) – single result found in a div with a numeric id
Returns:	parsed title, link and description of the single result
Return type:	str, str, str

parse_soup(soup)

Parses the MyAnimeList search soup for results.

search_url = 'https://myanimelist.net/anime.php?'

summary = '\tMyAnimeList, often abbreviated as MAL, is an anime and manga socialnetworking and social cataloging application website.\n\tThe site provides its users with a list-like system to organizeand score anime and manga.\n\tIt facilitates finding users who sharesimilar tastes and provides a large database on anime and manga.\n\tThesite claims to have 4.4 million anime and 775,000 manga entries.\n\tIn 2015, the site received over 120 million visitors a month.'

search_engine_parser.core.engines.stackoverflow module

Parser for StackOverflow search results.

class search_engine_parser.core.engines.stackoverflow.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches StackOverflow for a query string.

base_url = 'https://stackoverflow.com'

get_params(query=None, offset=None, page=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'StackOverflow'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div class="summary">
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the StackOverflow search soup for results.

search_url = 'https://stackoverflow.com/search?'

summary = '\tStack Overflow is a question and answer site for professional and enthusiast programmers.\n\tIt is a privately held website, the flagship site of the Stack Exchange Network, created in 2008 by Jeff Atwood and Joel Spolsky.\n\tIt features questions and answers on a wide range of topics in computer programming. It was created to be a more open alternative to earlier question and answer sites such as Experts-Exchange'

search_engine_parser.core.engines.yahoo module

Parser for Yahoo search results.

class search_engine_parser.core.engines.yahoo.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Yahoo for a query string.

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'Yahoo'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <div class="Sr">
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the Yahoo search soup for results.

search_url = 'https://search.yahoo.com/search?'

summary = '\tYahoo is one the most popular email providers and holds the fourth place in search with 3.90% market share.\n\tFrom October 2011 to October 2015, Yahoo search was powered exclusively by Bing. \n\tSince October 2015 Yahoo agreed with Google to provide search-related services and since then the results of Yahoo are powered both by Google and Bing. \n\tYahoo is also the default search engine for Firefox browsers in the United States (since 2014).'

search_engine_parser.core.engines.yandex module

Parser for Yandex search results.

class search_engine_parser.core.engines.yandex.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches Yandex for a query string.

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'Yandex'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <li class="serp-item">
Returns:	parsed title, link and description of the single result
Return type:	str, str, str

parse_soup(soup)

Parses the Yandex search soup for results.

search_url = 'https://yandex.com/search/?'

summary = '\tYandex is the largest technology company in Russia and the largest search engine on the internet in Russian, with a market share of over 52%.\n\tThe Yandex.ru home page is the 4th most popular website in Russia.\n\tIt also has the largest market share of any search engine in the Commonwealth of Independent States and is the 5th largest search engine worldwide after Google, Baidu, Bing, and Yahoo!'

search_engine_parser.core.engines.youtube module

Parser for YouTube search results.

class search_engine_parser.core.engines.youtube.Search

Bases: search_engine_parser.core.base.BaseSearch

Searches YouTube for a query string.

base_url = 'https://youtube.com'

get_params(query=None, page=None, offset=None, **kwargs)

This method should be overridden to return a dictionary of query parameters.

name = 'YouTube'

parse_single_result(single_result, return_type=<ReturnType.FULL: 'full'>, **kwargs)

Parses a single result from the page source and returns its fields.

Parameters:	single_result (bs4.element.ResultSet) – single result found in <ytd-video-renderer class="style-scope">
Returns:	parsed title, link and description of the single result
Return type:	dict

parse_soup(soup)

Parses the YouTube search soup for results.

search_url = 'https://youtube.com/results?'

summary = "\tYouTube is an American video-sharing website headquartered in San Bruno, California. Three former PayPal employees—Chad Hurley, Steve Chen, and Jawed Karim—created the service in February 2005.\n\tGoogle bought the site in November 2006 for US$1.65 billion; YouTube now operates as one of Google's subsidiaries. As of May 2019, more than 500 hours of video content are uploaded to YouTube every minute"
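The engine classes in this package share one shape: class attributes name, search_url and summary, plus the methods get_params, parse_soup and parse_single_result. The sketch below mirrors that shape with a stand-in base class so that it runs on its own. It is not the real search_engine_parser.core.base.BaseSearch, and ExampleSearch, example.com, the parameter names and the dict keys are all placeholders chosen for illustration.

```python
class BaseSearch:
    """Stand-in for the real base class; it only defines the documented surface."""

    name = None
    search_url = None
    summary = None

    def get_params(self, query=None, page=None, offset=None, **kwargs):
        raise NotImplementedError

    def parse_soup(self, soup):
        raise NotImplementedError

    def parse_single_result(self, single_result, return_type="full", **kwargs):
        raise NotImplementedError


class ExampleSearch(BaseSearch):
    """Hypothetical engine; every value below is a placeholder."""

    name = "Example"
    search_url = "https://example.com/search?"
    summary = "\tA placeholder engine used only for illustration."

    def get_params(self, query=None, page=None, offset=None, **kwargs):
        return {"q": query, "page": page}

    def parse_soup(self, soup):
        # In this sketch the "soup" is already a list of raw result dicts
        return list(soup)

    def parse_single_result(self, single_result, return_type="full", **kwargs):
        return {"titles": single_result["title"], "links": single_result["url"]}


engine = ExampleSearch()
raw = [{"title": "Hello", "url": "https://example.com/hello"}]
results = [engine.parse_single_result(item) for item in engine.parse_soup(raw)]
print(results)
```

The two-step split keeps result selection (parse_soup) separate from field extraction (parse_single_result), which is the division the method descriptions above suggest.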

Module contents