kb


DAM - es, tesseract, fscrawler, kibana

Setup, in order, on FtB (FredTheBeast)

JDK

https://www.linode.com/docs/guides/how-to-install-openjdk-on-ubuntu-20-04/

`echo $JAVA_HOME`

/usr/lib/jvm/java-11-openjdk-amd64

`export ES_JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"`
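The export above only lasts for the current shell, and it silently points at nothing if the JDK is not where expected. A minimal sketch of a guarded version, assuming the path shown above; `use_es_java_home` is a hypothetical helper name, not part of Elasticsearch or the JDK.

```shell
# Hypothetical helper: export ES_JAVA_HOME only if the directory holds a java binary.
use_es_java_home() {
  jdk_dir="$1"
  if [ -x "$jdk_dir/bin/java" ]; then
    export ES_JAVA_HOME="$jdk_dir"
    echo "ES_JAVA_HOME=$ES_JAVA_HOME"
  else
    echo "no java binary under $jdk_dir" >&2
    return 1
  fi
}

# Path taken from the notes above; adjust if the JDK lives elsewhere.
use_es_java_home "/usr/lib/jvm/java-11-openjdk-amd64" || true
```

Putting the function and the call in `~/.bashrc` would be one way to keep the setting across logins.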

elasticsearch

defaults

http://localhost:5601/app/management/kibana/indexPatterns/patterns/e98412c0-6e62-11ec-a132-5f15e23547c9#/?_a=(tab:indexedFields)

curl http://localhost:9200/_aliases

curl 'localhost:9200/_cat/indices?v'

add the ingest-attachment plugin:

sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-attachment
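After the plugin install (and an Elasticsearch restart), it may be worth confirming that the node actually loaded it. A rough sketch using the `_cat/plugins` endpoint; the `ES_URL` variable and the `has_plugin` helper are assumptions for illustration, and the default URL is the one used in the curl checks above.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: succeed if the plugin name appears in the
# _cat/plugins output read from stdin.
has_plugin() {
  grep -qw "$1"
}

# h=component limits the output to the plugin-name column.
if plugins="$(curl -fsS --max-time 5 "$ES_URL/_cat/plugins?h=component" 2>/dev/null)"; then
  if printf '%s\n' "$plugins" | has_plugin "ingest-attachment"; then
    echo "ingest-attachment is installed"
  else
    echo "ingest-attachment is missing"
  fi
else
  echo "Elasticsearch not reachable at $ES_URL"
fi
```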

tesseract-ocr

sudo apt install tesseract-ocr

Really, that's all there is to it, and “just” this on its own is great:

* smcnally@FredTheBeast:~/Downloads$ tesseract 'Screenshot 2022-01-06 at 12-03-37 Steve McNally LinkedIn.png' stdout

* This OCRs the screenshot in under 1 second and prints the text to the terminal.
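The single-file run above extends naturally to a whole folder. A minimal sketch, assuming the screenshots are PNGs in `~/Downloads` (override with the hypothetical `SRC_DIR` variable); `ocr_base` is an illustrative helper, not a tesseract feature. Instead of `stdout`, each run passes an output base name, so tesseract writes a `.txt` file next to the image.

```shell
# Hypothetical helper: strip the .png extension to get tesseract's output base name.
ocr_base() {
  printf '%s\n' "${1%.png}"
}

# Assumed input directory; the run above was from ~/Downloads.
src_dir="${SRC_DIR:-$HOME/Downloads}"

for img in "$src_dir"/*.png; do
  [ -e "$img" ] || continue   # skip when the glob matches nothing
  command -v tesseract >/dev/null || { echo "tesseract not installed" >&2; break; }
  tesseract "$img" "$(ocr_base "$img")"   # writes <base>.txt beside the image
done
```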

fscrawler

https://github.com/dadoonet/fscrawler

binaries: ~/fscrawler-es7-2.7-SNAPSHOT/bin

job settings: ~/.fscrawler/

https://fscrawler.readthedocs.io/en/fscrawler-2.8/admin/cli-options.html#cli-options

~/fscrawler-es7-2.7-SNAPSHOT/bin$ ./fscrawler ftb-screencaps --loop 1
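The `ftb-screencaps` job needs a settings file under `~/.fscrawler/` before the crawl does anything useful. A hedged sketch of creating one, assuming the YAML job format described in the FSCrawler docs; `write_job_settings` is a hypothetical helper, the crawl directory is a guess (the notes do not say which folder the job points at), and the helper refuses to overwrite an existing file.

```shell
# Hypothetical helper: write a minimal FSCrawler job file unless one already exists.
# Arguments: config dir, job name, directory to crawl.
write_job_settings() {
  job_dir="$1/$2"
  settings="$job_dir/_settings.yaml"
  if [ -e "$settings" ]; then
    echo "keeping existing $settings" >&2
    return 0
  fi
  mkdir -p "$job_dir"
  cat > "$settings" <<EOF
name: "$2"
fs:
  url: "$3"
elasticsearch:
  nodes:
  - url: "http://127.0.0.1:9200"
EOF
  echo "wrote $settings"
}

# Example call (the crawl directory is an assumption, not from the notes):
# write_job_settings "$HOME/.fscrawler" ftb-screencaps "$HOME/Downloads"
# After a crawl, search the index FSCrawler names after the job:
# curl 'localhost:9200/ftb-screencaps/_search?q=content:LinkedIn&pretty'
```

With `--loop 1` the crawler scans once and exits, so the search can be run as soon as the command returns.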

Previous write-up

https://meanbusiness.com/2020/11/17/elasticsearch-and-fscrawler/