2021-06-05
I have a lot of alerts configured with Google Scholar for various
research interests. It's a very cool concept: you set up a keyword
search like "blast fragmentation shockwave" and Google sends you a
summary email of new research that matches.
However, this can generate a lot of email each week that needs to be
sifted through (as of this writing, about 60 emails a week for me).
I developed a simple tool to help. It reads emails from Google Scholar
and ResearchGate and extracts links to articles and PDFs. You have to
save the emails in .eml format somewhere on your disk and point the
script to that folder. The script reads them all, searches for all href
attributes, deduplicates the links, and lists them along with their
descriptions.
Optionally, it can load PDF links directly in your browser or open a CSV
list of the links in your favorite spreadsheet.
This script makes dealing with large volumes of alerts much more efficient.
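The read, extract, and deduplicate steps described above can be sketched with the Python standard library alone. This is a minimal illustration, not the tool's actual code: the names `LinkCollector`, `links_from_html`, `links_from_eml`, and `collect_links` are hypothetical, and it assumes each saved alert carries an HTML body and that keeping the first description seen for a URL is acceptable.

```python
from email import policy
from email.parser import BytesParser
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect (url, description) pairs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (url, description) pairs
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            description = " ".join("".join(self._text).split())
            self.links.append((self._href, description))
            self._href = None


def links_from_html(html):
    """Return every (url, description) pair found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def links_from_eml(path):
    """Parse one .eml file and return the links in its HTML body."""
    with open(path, "rb") as handle:
        message = BytesParser(policy=policy.default).parse(handle)
    body = message.get_body(preferencelist=("html",))
    if body is None:  # assumption: alerts without an HTML part are skipped
        return []
    return links_from_html(body.get_content())


def collect_links(folder):
    """Read every .eml file in folder; keep the first description per URL."""
    unique = {}
    for path in sorted(Path(folder).expanduser().glob("*.eml")):
        for url, description in links_from_eml(path):
            unique.setdefault(url, description)
    return unique
```

Calling `collect_links` on the folder of saved emails returns a dictionary that maps each distinct URL to its link text, which is enough to print a listing or feed a later step. A dictionary is used here because it deduplicates by key while preserving the order in which links were first seen.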
NOTE: In LibreOffice Calc, use the HYPERLINK function on the URL
column to create clickable links that will open automatically in your browser.
To work with the tool you will need Python 3.9 and virtualenv
installed. You will also need to clone the git repository:
cd ~/repositories
git clone https://github.com/TroyWilliams3687/extract_email
Once the git repository has been cloned, change into the new extract_email folder and run make to construct the virtual environment:
make
NOTE: You will need an environment variable called python that points
to your local Python bin folder. It should look something like:
echo $python
~/opt/python_3.9.5/bin
Activate the virtual environment:
. .venv/bin/activate
Or you can use make:
make shell
Execute the script:
extract "$HOME/tmp/extract tbird email" --verbose --launch-pdf
OR
extract "$HOME/tmp/extract tbird email" --verbose --launch-csv
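The two launch options correspond to the optional behaviours described earlier: opening PDF links in the browser, or producing a CSV list for a spreadsheet. A rough sketch of how such steps could look is shown here. It is not taken from the tool itself: the function names are hypothetical, `links` is assumed to be a dictionary mapping each URL to its description, and treating a URL as a PDF only when its path ends in `.pdf` is a simplifying assumption that will miss PDF links without that suffix.

```python
import csv
import webbrowser
from pathlib import Path
from urllib.parse import urlparse


def is_pdf_link(url):
    """Guess whether a URL points at a PDF from its path alone (assumption)."""
    return urlparse(url).path.lower().endswith(".pdf")


def write_csv(links, csv_path):
    """Write a Description/URL table that a spreadsheet can open."""
    csv_path = Path(csv_path)
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Description", "URL"])
        for url, description in links.items():
            writer.writerow([description, url])
    return csv_path


def launch_pdfs(links, opener=webbrowser.open_new_tab):
    """Open every PDF-looking link in a browser tab; return the URLs opened."""
    opened = [url for url in links if is_pdf_link(url)]
    for url in opened:
        opener(url)
    return opened
```

The `opener` parameter defaults to `webbrowser.open_new_tab` and exists only so the function can be exercised without starting a browser. The CSV uses a URL column so that the HYPERLINK note above applies to it directly.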