Find Notes with Full Text Search

You can now search your notes by keyword from the command line interface. The latest Dnote CLI release features the find command, which performs a blazingly fast full-text search across your locally stored notes. Easily retrieve the information you need without leaving your command line.

[Figure: a terminal demonstrating the find command. Caption: Looking for a note on using the `tar` command]

To find notes by keyword, you can use dnote find [keyword], or its short version, dnote f [keyword]. For instance, the example above shows a search with the keyword ‘tar’. The output contains the book name and the note id so that you can access the content using the view command. The result also shows a short snippet of the match, with the matching term highlighted.
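The highlighted snippet can be reproduced outside Dnote. The sketch below assumes the snippet is produced by the snippet() function of SQLite's FTS5 extension; the in-memory table demo_fts, the sample note, and the [ and ] markers are made up for illustration and are not Dnote's actual schema or output format.

```shell
#!/bin/bash
# Illustrative only: demo_fts and its single row are hypothetical.
# Assumes a sqlite3 binary built with the FTS5 extension.
# snippet() arguments: table, column index, opening marker, closing marker,
# ellipsis text, and the maximum number of tokens to return.
highlighted=$(sqlite3 :memory: "
  CREATE VIRTUAL TABLE demo_fts USING fts5(body);
  INSERT INTO demo_fts (body) VALUES ('extract an archive with tar xvf');
  SELECT snippet(demo_fts, 0, '[', ']', '...', 28) FROM demo_fts WHERE demo_fts MATCH 'tar';
")
echo "$highlighted"
```

The matched term comes back wrapped in the chosen markers, which a client can then turn into terminal colors.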

You can use multiple keywords by wrapping the query in double quotation marks. For instance, you can run dnote find "building heap". Dnote treats whitespace in the query as an implicit ‘AND’ operator and finds the notes that match all of the whitespace-separated terms. In this example, Dnote returns notes whose bodies match both ‘building’ and ‘heap’.
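To see the implicit ‘AND’ behavior in isolation, here is a minimal sketch against a throwaway in-memory database. It assumes a sqlite3 binary built with FTS5; the table demo_fts and its three rows are hypothetical and only stand in for real notes.

```shell
#!/bin/bash
# Illustrative only: demo_fts and its rows are hypothetical.
# Assumes a sqlite3 binary built with the FTS5 extension.
# In an FTS5 query, terms separated by whitespace are combined with AND,
# so only the row containing both 'building' and 'heap' is returned.
matches=$(sqlite3 :memory: "
  CREATE VIRTUAL TABLE demo_fts USING fts5(body);
  INSERT INTO demo_fts (body) VALUES
    ('building a heap from an array'),
    ('building a house'),
    ('heap sort explained');
  SELECT body FROM demo_fts WHERE demo_fts MATCH 'building heap';
")
echo "$matches"
```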

Dnote performs a full-text search on all Unicode characters in the note bodies, with the exception of whitespace and punctuation. Moreover, it wraps the output of the Unicode tokenizer with a porter tokenizer, which implements the Porter Stemming Algorithm. This means that the full-text search not only matches all Unicode characters but also uses stemming to match related terms when looking for notes in English. For instance, dnote f mesmerize will match notes that contain terms such as ‘mesmerize’, ‘mesmerizing’, and ‘mesmerized’.
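The effect of stemming can be checked with a small experiment. This sketch assumes the tokenizer setup described above corresponds to the 'porter unicode61' option of SQLite's FTS5 and that your sqlite3 binary includes FTS5; the table demo_fts and its rows are hypothetical.

```shell
#!/bin/bash
# Illustrative only: demo_fts and its rows are hypothetical.
# Assumes a sqlite3 binary built with the FTS5 extension.
# 'porter unicode61' wraps the unicode61 tokenizer with the porter stemmer,
# so the query term and the indexed terms are reduced to a common stem.
stemmed=$(sqlite3 :memory: "
  CREATE VIRTUAL TABLE demo_fts USING fts5(body, tokenize = 'porter unicode61');
  INSERT INTO demo_fts (body) VALUES
    ('I was mesmerized by the view'),
    ('a mesmerizing performance'),
    ('nothing to see here');
  SELECT count(*) FROM demo_fts WHERE demo_fts MATCH 'mesmerize';
")
echo "$stemmed"
```

The count covers the two rows containing ‘mesmerized’ and ‘mesmerizing’, even though neither contains the literal query term.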

Search is very fast. To measure the execution time, let’s run some test searches on the Enron Email Dataset, which contains about 500,000 documents. First, we write a small script to load each message into our SQLite database for Dnote:

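#!/bin/bash
# Load every file under the dataset root into the Dnote SQLite database as a note.

dataset_root=$1

for d in "$dataset_root"/*/; do
  echo "processing $d"

  # Read NUL-delimited paths so that file names containing spaces are handled.
  find "$d" -type f -print0 | while IFS= read -r -d '' f; do
    # Double every single quote so the body is a valid SQL string literal.
    value=$(sed "s/'/''/g" "$f")
    sql="INSERT INTO notes (body, uuid, book_uuid, added_on) VALUES ('$value', '$d$f', 1, 1)"

    sqlite3 ~/.dnote/dnote.db "$sql"
  done
done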

After running the script, we can confirm that the whole dataset is loaded into Dnote.

▶ sqlite3 ~/.dnote/dnote.db "select count(*) from notes;"
517204

Let’s run some test queries to see the performance on a ThinkPad with an Intel i7 CPU.

[Figure: a terminal demonstrating the find command. Caption: Looking for Starcraft players at Enron]

# 3 results
▶ time dnote find starcraft &>/dev/null
dnote find starcraft &> /dev/null  0.01s user 0.01s system 100% cpu 0.011 total
# 42 results
▶ time dnote find "limp bizkit" &>/dev/null
dnote find "limp bizkit" &> /dev/null  0.01s user 0.01s system 109% cpu 0.019 total

# 52604 results
▶ time dnote find "note taking" &>/dev/null
dnote find "note taking" &> /dev/null  7.09s user 0.16s system 102% cpu 7.075 total

It takes between 0.01 and 0.02 seconds to match and display a small number of records. The find command is noticeably slower, at about 7 seconds, when matching around 10% of the half million documents. However, situations such as this will no longer be problematic once we implement pagination. The following measurement demonstrates that, with pagination, we can expect the find command to be snappy even for the last example, which matches more than 52,000 records.

# without pagination
▶ time sqlite3 ~/.dnote/dnote.db -cmd ".timer ON" "SELECT notes.rowid, books.label AS book_label, snippet(note_fts, 0, '<dnotehl>', '</dnotehl>', '...', 28) FROM note_fts INNER JOIN notes ON notes.rowid = note_fts.rowid INNER JOIN books ON notes.book_uuid = books.uuid WHERE note_fts MATCH 'note taking'" &> /dev/null

sqlite3 ~/.dnote/dnote.db -cmd ".timer ON"  &> /dev/null  5.95s user 0.10s system 99% cpu 6.085 total

# with pagination
▶ time sqlite3 ~/.dnote/dnote.db -cmd ".timer ON" "SELECT notes.rowid, books.label AS book_label, snippet(note_fts, 0, '<dnotehl>', '</dnotehl>', '...', 28) FROM note_fts INNER JOIN notes ON notes.rowid = note_fts.rowid INNER JOIN books ON notes.book_uuid = books.uuid WHERE note_fts MATCH 'note taking' LIMIT 50 OFFSET 50" &> /dev/null

sqlite3 ~/.dnote/dnote.db -cmd ".timer ON"  &> /dev/null  0.01s user 0.00s system 97% cpu 0.019 total
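
The LIMIT and OFFSET clauses used above can be derived from a page number. The following is only a sketch of that arithmetic on a tiny in-memory table, not how Dnote implements pagination; the table demo_fts, its rows, and the variables page and page_size are hypothetical, and a sqlite3 binary with FTS5 is assumed.

```shell
#!/bin/bash
# Illustrative only: demo_fts, its rows, page, and page_size are hypothetical.
# Assumes a sqlite3 binary built with the FTS5 extension.
page=2
page_size=2
# Pages are 1-indexed here, so page 2 skips the first page_size rows.
offset=$(( (page - 1) * page_size ))

rows=$(sqlite3 :memory: "
  CREATE VIRTUAL TABLE demo_fts USING fts5(body);
  INSERT INTO demo_fts (body) VALUES
    ('note one'), ('note two'), ('note three'), ('note four'), ('note five');
  SELECT rowid FROM demo_fts WHERE demo_fts MATCH 'note'
    ORDER BY rowid LIMIT $page_size OFFSET $offset;
")
echo "$rows"
```

Only the rows for the requested page are returned, so the cost of building snippets is limited to one page of results rather than every match.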

The find command has been the one command that I felt was missing from Dnote ever since its early days. I am happy to have finally implemented it. Now your knowledge is at your fingertips: that pesky snippet you are trying to remember, the word you are learning, class notes. I hope this command is as useful for you as it is for me.

Sung Won Cho

I taught myself to code. I am making Dnote because we need to learn every day in this life.

Sydney, Australia https://sung.io