Getting Solr Running in 10 Minutes with Fusion
Fusion is a collection of commercial and open source software maintained by Lucidworks. It makes a collection, or groups of collections, searchable by keyword, much like the keywords you use on Google to find pages. The core of Fusion is Solr, and Solr itself is based on Lucene.
There are many different use cases for search. For use cases revolving around logs, Splunk, Loggly and Elastic may be typical choices. For use cases revolving around site search, solutions such as Algolia and Google's Search Appliance (discontinued) may be typical. In practice, the choice of a search solution depends on many factors, including the size of the index and how quickly the solution can be implemented.
For a wide variety of use cases, including the ones listed above, Fusion may be the easiest to deploy and get running quickly. Using this guide, the power of Solr and Fusion's signals and field extractions is only minutes away!
This guide will enable you to quickly launch Fusion instances on Google's Cloud using a startup script written in Bash. With a bit of modification, this script can also be used to launch a Fusion instance on AWS. Feel free to pop into our Discord channel if you have any questions regarding using these scripts on AWS.
Here's the script we'll be using for reference:
#!/bin/bash
if [ -n "$1" ]; then
  echo "found route named '$1'";
else
  echo "need to set route";
  exit;
fi

NEW_UUID=$(cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 4 | head -n 1)

gcloud compute instances create fusion-server-$NEW_UUID \
--machine-type "n1-standard-8" \
--image "ubuntu-1604-xenial-v20170811" \
--image-project "ubuntu-os-cloud" \
--boot-disk-size "50" \
--boot-disk-type "pd-ssd" \
--boot-disk-device-name "$NEW_UUID" \
--zone us-central1-a \
--labels ready=true \
--tags lucid \
--preemptible \
--metadata startup-script='#! /bin/bash
sudo su -
apt-get update -y
apt-get install default-jre -y
wget https://download.lucidworks.com/fusion-3.1.2/fusion-3.1.2.tar.gz
tar xvfz fusion*.tar.gz
/fusion/3.1.2/bin/fusion start
'

gcloud compute instances attach-disk fusion-server-$NEW_UUID --disk=fusion-data --zone us-central1-a

IP=$(gcloud compute instances describe fusion-server-$NEW_UUID | grep natIP | cut -d: -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')

# set up a proxy via wisdom for the APIs
curl -X DELETE https://api.wisdom.sh/api/$1
# upstream_url is double-quoted so that $IP is expanded by the shell
curl -X POST \
--url https://api.wisdom.sh/api/ \
--data 'name='$1 \
--data "upstream_url=http://$IP:8764/" \
--data 'uris='/$1 \
| python -m json.tool

echo "Fusion UI available in a few minutes at: http://$IP:8764"
echo; echo "API access available in a few minutes at: https://api.wisdom.sh/$1/api/..."
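The script finds the instance's external IP by grepping the describe output for natIP and trimming the result. Here is a minimal sketch of that parsing step as a standalone function, so it can be tried on sample text without a live instance. The name extract_nat_ip is hypothetical and not part of the streams repo, and the sketch uses the portable [[:space:]] class in place of the original [ \t].

```shell
#!/bin/bash
# Hypothetical helper (not from the streams repo): read the YAML printed by
# `gcloud compute instances describe` on stdin and print the first natIP
# value with surrounding whitespace removed.
extract_nat_ip() {
  grep natIP | head -n 1 | cut -d: -f2 | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

# Intended use (needs gcloud and a running instance, so it is left commented out):
# IP=$(gcloud compute instances describe "fusion-server-$NEW_UUID" --zone us-central1-a | extract_nat_ip)
#
# gcloud can also print the field directly, which avoids text parsing:
# gcloud compute instances describe "fusion-server-$NEW_UUID" --zone us-central1-a \
#   --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
```

Taking only the first match guards against an instance that reports more than one natIP line. Whether that is the address you want is an assumption to check against your own network setup.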
Note that the instance we're starting is an 8-core machine with 30 GB of RAM. Fusion requires at least 8 GB of RAM to run all of its required support services.
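To see how other sizes in the same family compare with that 8 GB floor, here is a rough sketch of the arithmetic. It relies on n1-standard machine types providing 3.75 GB (3840 MB) of memory per vCPU. The function names are made up for illustration, and the 8 GB threshold is simply the figure quoted above, not a tested sizing recommendation.

```shell
#!/bin/bash
# n1-standard machine types provide 3.75 GB (3840 MB) of memory per vCPU.
n1_standard_mem_mb() {
  local vcpus=$1
  echo $(( vcpus * 3840 ))
}

# Succeeds when the given n1-standard size has at least 8 GB (8192 MB),
# the minimum quoted in this guide.
meets_fusion_minimum() {
  [ "$(n1_standard_mem_mb "$1")" -ge 8192 ]
}

for vcpus in 1 2 4 8; do
  if meets_fusion_minimum "$vcpus"; then verdict="ok"; else verdict="too small"; fi
  echo "n1-standard-$vcpus: $(n1_standard_mem_mb "$vcpus") MB ($verdict)"
done
```

By this arithmetic n1-standard-4 is the smallest size in the family that clears the minimum, while the n1-standard-8 used by the script leaves room to spare.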
Check Out the Repo in Google's Cloud Console
Before we go any further, you'll need to be logged into Google's Cloud Console. If you haven't signed up for Google Cloud yet, you may apply for a $300 free trial credit. If you've just signed up, you may also need to configure access to a few services via the API & Services section. Ensure you have the Google Cloud APIs configured for this guide and that you are logged into the web shell before continuing.
To continue, let's check out the Lucidworks streams repo into your console's container:
kordless@wisdom:~/demos$ git clone https://github.com/lucidworks/streams.git
Cloning into 'streams'...
remote: Counting objects: 2273, done.
remote: Compressing objects: 100% (918/918), done.
remote: Total 2273 (delta 1103), reused 2064 (delta 1103), pack-reused 208
Receiving objects: 100% (2273/2273), 14.29 MiB | 4.18 MiB/s, done.
Resolving deltas: 100% (1167/1167), done.
Change into the scripts directory:
$ cd streams/scripts
The instance you are going to create with the script is preemptible, which means that sometime in the next 24 hours Google will shut the instance off. If you want the instance to keep running, comment out this line like this:
$ vim start
# --preemptible \
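One caveat when editing by hand: in bash, a # comment runs to the end of its line, and a trailing backslash inside the comment does not continue the command. If the flag sits in the middle of the gcloud command, as it does in the listing above, a commented line there ends the command early and the --metadata line that follows is run as a separate command. Deleting the line avoids this. Here is a small sketch of doing that with sed. The helper name drop_preemptible is hypothetical, and the file name in the usage comment is the one used later in this guide.

```shell
#!/bin/bash
# Hypothetical helper: delete the --preemptible line from a launch script,
# keeping a .bak copy of the original next to it.
drop_preemptible() {
  local script=$1
  # Match a line holding only the flag, optional whitespace and an optional
  # trailing backslash, then delete it.
  sed -i.bak '/^[[:space:]]*--preemptible[[:space:]]*\\*$/d' "$script"
}

# Usage:
# drop_preemptible start-fusion-gcp.sh
```

The -i.bak form works with both GNU and BSD sed, and the backup makes it easy to restore the preemptible behavior later.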
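If you would like the index to be persistent across different instance launches, create a Google Compute Engine disk called fusion-disk (note the screencast creates one called 'fusion-disk-2'). Whichever name you choose must match the --disk value on the script's attach-disk line, which is fusion-data in the script as listed.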
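The disk can also be created from the shell rather than the web console. The following is a sketch only: the wrapper name create_fusion_disk and the 100GB size are assumptions, not values from this guide. The name defaults to fusion-data because that is what the script's attach-disk line expects, so change one or the other if you name the disk differently. A DRY_RUN switch prints the command instead of running it.

```shell
#!/bin/bash
# DISK_NAME must match the --disk value passed to attach-disk in the launch script.
DISK_NAME=${DISK_NAME:-fusion-data}
DISK_SIZE=${DISK_SIZE:-100GB}   # assumed size, not taken from the guide
ZONE=${ZONE:-us-central1-a}     # same zone the launch script uses

# Hypothetical wrapper around `gcloud compute disks create`.
create_fusion_disk() {
  local cmd=(gcloud compute disks create "$DISK_NAME"
             --size "$DISK_SIZE" --type pd-ssd --zone "$ZONE")
  if [ -n "${DRY_RUN:-}" ]; then
    echo "${cmd[*]}"   # print the command instead of running it
  else
    "${cmd[@]}"
  fi
}

# Preview first, then run for real:
# DRY_RUN=1 create_fusion_disk
# create_fusion_disk
```

The disk has to live in the same zone as the instance, which is why the zone default mirrors the one in the launch script.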
Next, un-comment the following line in start-fusion-gcp.sh to enable attaching that disk to the instance:
$ vi start-fusion-gcp.sh
...
# uncomment this line to enable mounting a non-ephemeral disk
# gcloud compute instances attach-disk fusion-server-$NEW_UUID --disk=fusion-data --zone us-central1-a
To start a Fusion instance, run the following command, passing a simple lowercase string with no special characters as the parameter. The script uses this string as the route name for API access:
kordless@wisdom:~/$ cd streams/scripts
kordless@wisdom:/streams/scripts$ ./start-fusion-gcp.sh myfusion
This results in the following output:
Fusion UI available in a few minutes at: http://22.214.171.124:8764
API access available in a few minutes at: https://api.wisdom.sh/myfusion/api/...
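Because the parameter ends up in the API URL, it may be worth rejecting awkward names before anything is launched. The script as listed only checks that an argument is present. Here is a minimal sketch of a stricter check. The function name valid_route_name is hypothetical, and allowing digits alongside lowercase letters is an assumption about what counts as a simple string.

```shell
#!/bin/bash
# Hypothetical check: accept only non-empty strings made of lowercase
# letters and digits.
valid_route_name() {
  [[ $1 =~ ^[a-z0-9]+$ ]]
}

# Possible guard near the top of the launch script:
# valid_route_name "$1" || { echo "route must be lowercase letters and digits" >&2; exit 1; }
```

With this guard, a name such as myfusion passes, while My-Fusion or an empty argument stops the script before any instance is created.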
Note that once the instance has started, its APIs may be accessed through Wisdom's named API, using the second URL printed by the script.
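Since both URLs only respond after Fusion has finished starting, a small polling loop can stand in for refreshing the browser. This is a sketch under assumptions: wait_for_url is a hypothetical helper, the attempt count and delay are arbitrary defaults, and any HTTP response at all is treated as "up", which says nothing about whether Fusion is fully initialized.

```shell
#!/bin/bash
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 30), seconds between attempts (default 10).
wait_for_url() {
  local url=$1 attempts=${2:-30} delay=${3:-10} i
  for (( i = 1; i <= attempts; i++ )); do
    # -s hides progress output, -o discards the body, --max-time bounds each try.
    # curl exits 0 on any HTTP response, including error statuses.
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
  done
  echo "gave up after $attempts attempt(s): $url" >&2
  return 1
}

# Example, using the IP variable set by the launch script:
# wait_for_url "http://$IP:8764"
```

With the defaults this waits about five minutes before giving up, which fits the "few minutes" the script's output mentions.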
Here's the screencast for starting the instance:
Finally, after a few minutes of waiting, the instance should be available for access using the URL provided by the script. By default, Fusion serves its UI via the
Keep your eye out for more Fusion videos!