4 Ways to Search the Web From the Linux Terminal

The Most Familiar Way: Google in a Browser

Let’s face it, Google has been so dominant for the past 25 years that the search engine’s name has become a verb meaning “to search the internet.” Chances are, you use Google in a graphical web browser today, and any alternative must deliver in the same way.

Searching Google in a browser that runs in the terminal rather than in a GUI will therefore probably be your starting point. You may already use a text-based browser like Lynx; if not, expect an adjustment period. A text-only view of a website exposes all its flaws, from missing alt text to an over-dependence on JavaScript.

Having said that, Google’s search interface is known for its minimalism, and this helps a lot when viewing it in a text browser.

It’s pretty easy to navigate to the search box, enter your search, and see the results. You can even go to the results page immediately, if you can remember the URL and parameter:

$ lynx 'https://www.google.com/search?q=search+from+commandline'

Exploring those results, though, isn’t quite as nice an experience.

You may prefer to use a different service. DuckDuckGo is a privacy-focused search engine that uses a vast number of sources, including Bing, for its results. With a text browser, it has an even more minimal interface than Google’s.

And DuckDuckGo’s search results are much easier to read than Google’s, despite still being text-only. They are clearly numbered and separated by white space.

The other significant option is Bing, but I’ve found Microsoft’s search engine difficult to use with a text browser. In fact, if you really need to use Bing, I recommend just going to its results directly rather than trying to use its search form:

$ lynx "https://www.bing.com/search?q=search+from+commandline"
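
The two direct-URL commands above follow the same pattern, so a small shell helper can build the URL from any search phrase. What follows is a minimal sketch rather than anything from the tools themselves: the function names urlencode, search_url, and websearch are hypothetical, and it assumes only the Google and Bing result URLs already shown.

```shell
# urlencode STRING: percent-encode STRING for use as a query parameter value.
# Works byte by byte on od's hex dump, so it needs only POSIX sh, od, and tr.
urlencode() {
    for hex in $(printf '%s' "$1" | od -An -v -tx1); do
        case "$hex" in
            20) printf '+' ;;   # a space becomes "+", as in the URLs above
            2d|2e|5f|7e|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
                # unreserved characters (- . _ ~ 0-9 A-Z a-z) pass through
                printf "\\$(printf '%03o' "0x$hex")" ;;
            *)  # every other byte is written as %XX
                printf '%%%s' "$hex" | tr 'a-f' 'A-F' ;;
        esac
    done
    echo
}

# search_url ENGINE TERMS...: print the results-page URL for google or bing.
search_url() {
    engine="$1"
    shift
    case "$engine" in
        google) base='https://www.google.com/search' ;;
        bing)   base='https://www.bing.com/search' ;;
        *)      echo "unknown engine: $engine" >&2; return 1 ;;
    esac
    printf '%s?q=%s\n' "$base" "$(urlencode "$*")"
}

# websearch ENGINE TERMS...: open the results page in Lynx.
websearch() {
    url=$(search_url "$@") || return 1
    lynx "$url"
}
```

With these functions loaded in your shell, websearch google search from commandline should open the same page as the Google command earlier, and characters such as & or + in the search phrase are escaped rather than being interpreted as part of the URL.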

A Terminal Front-End to DuckDuckGo: ddgr

If you’re happy with DuckDuckGo’s service, then there’s more good news in the form of a terminal client. ddgr is a command-line tool, written in Python, that searches DuckDuckGo. It presents search results in your terminal, each with a clear title, description, and domain or URL.

By default, ddgr operates in an interactive mode which provides the easiest way of opening a specific result. Start by entering a search on the command line:

$ ddgr linux

ddgr displays a page of results, ten of them by default.

At the bottom, ddgr presents a prompt that you can use interactively to refine results or take further action. Enter the number of the result you want to view. If you’ve set a terminal browser through the widely supported BROWSER environment variable, ddgr opens the result you chose in it. You can also have results open in your graphical browser if you prefer.
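
The interactive prompt gets in the way when you only want a quick list or when you call ddgr from a script. As a sketch, assuming a ddgr version that supports the --np (no prompt) and -n (number of results) options listed in its help output, two small wrappers cover both cases. The function names ddg_top and ddg_lynx are hypothetical.

```shell
# ddg_top COUNT TERMS...: print the first COUNT results, then exit
# instead of waiting at ddgr's interactive prompt.
ddg_top() {
    count="$1"
    shift
    ddgr --np -n "$count" "$@"
}

# ddg_lynx TERMS...: run an interactive search with BROWSER set to lynx
# for this one command only, so chosen results open in the terminal.
ddg_lynx() {
    BROWSER=lynx ddgr "$@"
}
```

For example, ddg_top 5 linux terminal prints five results and returns to the shell, while ddg_lynx linux behaves like the plain ddgr command above but opens whichever result you pick in Lynx.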

Limited Search Results With the DuckDuckGo API

At first glance, DuckDuckGo’s API seems promising, and it’s very easy to use. For example, you can get a list of Instant Answer results in JSON format using this endpoint:

https://api.duckduckgo.com/?q=<search-term>&format=json

With curl and a tool like jq to parse JSON, you can get close to fetching useful results in a format that can be used for scripting and further processing:

$ curl -s 'https://api.duckduckgo.com/?q=linux&format=json' |
    jq -r '.AbstractURL'

However, these results are very limited because of how DuckDuckGo licenses its syndicated results: the endpoint returns Instant Answers, not the full list of web results. It’s worth experimenting with search terms that match your own requirements, but bear in mind that this approach will not produce the same results as a search on duckduckgo.com.
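
For scripting, the curl and jq pipeline can be wrapped in a function that also takes care of URL-encoding the search term. This is a rough sketch: the function name ddg_answer is hypothetical, and apart from AbstractURL the JSON field names (Heading, AbstractText, RelatedTopics, FirstURL) are assumptions about the response layout that you should check against a real reply.

```shell
# ddg_answer TERMS...: query the Instant Answer endpoint and print
# whichever summary fields and related links are present, one per line.
ddg_answer() {
    # -G turns the -d/--data-urlencode pairs into a GET query string,
    # so curl handles the escaping of spaces and symbols in the term.
    curl -sG 'https://api.duckduckgo.com/' \
        --data-urlencode "q=$*" \
        -d 'format=json' |
    jq -r '(.Heading, .AbstractText, .AbstractURL, .RelatedTopics[]?.FirstURL)
           | select(. != null and . != "")'
}
```

Running ddg_answer linux should print a heading, a short abstract, and some URLs when an Instant Answer exists, and nothing at all when it doesn’t, which makes the empty case easy to detect in a script.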

A Comprehensive Third-Party Search API—at a Cost

SerpApi is an unofficial alternative to DuckDuckGo’s own API. The difference is that it can work across several search engines and provides full-text search results. The drawback is that you’ll need to either pay or work within the 100-search monthly limit on the free plan. If you can accept those limitations, SerpApi is a good choice for scripted search results:

$ curl -s 'https://serpapi.com/search?engine=duckduckgo&q=<search-term>&api_key=<api-key>' |
    jq '.organic_results[0]'

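Although HTTPS helps to prevent certain types of data leaks, typing your API key on the command line is a security risk: it can end up in your shell history and may be visible to other users in the process list while curl runs. To mitigate that, investigate curl’s options, in particular -K (read options from a config file), -G (send request data as a GET query string), and -d (supply request data, which can be read from a file with an @ prefix).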
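
One possible arrangement, sketched here under assumptions, keeps the key in a curl config file that only you can read and passes that file with -K. The file path, the SERPAPI_CURLRC variable, and the function name serp_search are all hypothetical; the endpoint and the organic_results field are the ones used in the command above.

```shell
# Create the config file in an editor (not with echo, which would put the
# key back into your shell history) and restrict it with chmod 600.
# It needs a single line:
#   data = "api_key=<api-key>"

# serp_search TERMS...: print the first DuckDuckGo result from SerpApi
# without the API key ever appearing in the command's arguments.
serp_search() {
    # -K reads the api_key data from the file; -G merges it with the
    # -d/--data-urlencode pairs below into one GET query string.
    curl -sG -K "${SERPAPI_CURLRC:-$HOME/.config/serpapi.curlrc}" \
        --data-urlencode "q=$*" \
        -d 'engine=duckduckgo' \
        'https://serpapi.com/search' |
    jq '.organic_results[0]'
}
```

Because the key lives only in the file, serp_search linux terminal can be typed, logged, or shared in a script without exposing it.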
