December 16, 2017 · solr search gcp

Getting Solr Running in 10 Minutes with Fusion

Fusion is a collection of commercial and open source software maintained by Lucidworks. Its purpose is to make a collection, or groups of collections, searchable by keywords, such as the ones you use on Google to find pages. The core of Fusion is Solr, and Solr itself is based on Lucene.


There are many different use cases for search. For use cases revolving around logs, Splunk, Loggly and Elastic may be typical choices. For use cases revolving around site search, solutions such as Algolia and the Google Search Appliance (discontinued) may be typical. Which search solution you end up implementing hinges on many factors, including the size of the index and how quickly the solution can be put in place.

For a wide variety of use cases, including the ones listed above, Fusion may be the easiest to deploy and get running quickly. Using this guide, the power of Solr and of Fusion's signals and field extractions is only minutes away!

Scripted Fusion

This guide will enable you to quickly launch Fusion instances on Google's Cloud using a startup script written in Bash. With a bit of modification, this script can also be used to launch a Fusion instance on AWS. Feel free to pop into our Discord channel if you have any questions regarding using these scripts on AWS.

Here's the script we'll be using for reference:

if [ -n "$1" ]; then echo "found route named '$1'"; else echo "need to set route"; exit; fi

NEW_UUID=$(cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 4 | head -n 1)

gcloud compute instances create fusion-server-$NEW_UUID \
--machine-type "n1-standard-8" \
--image "ubuntu-1604-xenial-v20170811" \
--image-project "ubuntu-os-cloud" \
--boot-disk-size "50" \
--boot-disk-type "pd-ssd" \
--boot-disk-device-name "$NEW_UUID" \
--zone us-central1-a \
--labels ready=true \
--tags lucid \
--preemptible \
--metadata startup-script='#! /bin/bash
sudo su -
apt-get update -y
apt-get install default-jre -y
tar xvfz fusion*.tar.gz
/fusion/3.1.2/bin/fusion start'

# uncomment this line to enable mounting a non-ephemeral disk
# gcloud compute instances attach-disk fusion-server-$NEW_UUID --disk=fusion-data --zone us-central1-a

IP=$(gcloud compute instances describe fusion-server-$NEW_UUID | grep natIP | cut -d: -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')

# set up a proxy via wisdom for the APIs
curl -X DELETE$1
curl -X POST \
  --url \
  --data 'name='$1 \
  --data "upstream_url=http://$IP:8764/" \
  --data 'uris='/$1 \
  | python -m json.tool

echo "Fusion UI available in a few minutes at: http://$IP:8764"
echo "API access available in a few minutes at:$1/api/..."

Note that the instance we're starting is an 8-core machine with 30 GB of RAM. Fusion requires at least 8 GB of RAM to run all of its required support services.
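The IP lookup near the end of the script can be tried on its own. Here is a minimal sketch of that step, assuming the describe output contains a single natIP line. The function name extract_nat_ip is made up for this example, and trimming the whitespace with tr instead of sed is a substitution, not what the original script does.

```shell
#!/bin/sh
# Hypothetical helper: read `gcloud compute instances describe` output on
# stdin and print the external (NAT) IP address.
extract_nat_ip() {
  # Keep the first natIP line, take the text after the colon,
  # then strip all whitespace (including the trailing newline).
  grep natIP | head -n 1 | cut -d: -f2 | tr -d '[:space:]'
}
```

It could be used like this: IP=$(gcloud compute instances describe fusion-server-$NEW_UUID | extract_nat_ip). If you would rather skip the text parsing, gcloud's --format flag should also be able to return the field directly, for example --format='get(networkInterfaces[0].accessConfigs[0].natIP)'.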

Check Out the Repo in Google's Cloud Console

Before we go any further, you'll need to be logged into Google's Cloud Console. If you haven't signed up for Google Cloud yet, you may apply for a $300 free trial credit. If you've just signed up, you may also need to configure access to a few services via the APIs & Services section. Ensure you have the Google Cloud APIs configured for this guide before continuing, and that you are logged into the web shell:


To continue, let's check out the Lucidworks streams repo into your console's container:

kordless@wisdom:~/demos$ git clone
Cloning into 'streams'...
remote: Counting objects: 2273, done.
remote: Compressing objects: 100% (918/918), done.
remote: Total 2273 (delta 1103), reused 2064 (delta 1103), pack-reused 208
Receiving objects: 100% (2273/2273), 14.29 MiB | 4.18 MiB/s, done.
Resolving deltas: 100% (1167/1167), done.

Change into the scripts directory:

$ cd streams/scripts

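The instance you are going to create with the script is preemptible, which means Google may stop it at any time and will always stop it within 24 hours of launch. If you want the instance to keep running, delete the --preemptible line from the script. Deleting is safer than commenting out here: a # comment in the middle of a backslash-continued command ends the command at that line, so the --metadata flag that follows would no longer be passed to gcloud. This is the line to remove:

$ vim start
--preemptible \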
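An alternative to editing the script by hand each time is to make the flag conditional. The sketch below is one way to do that. PREEMPTIBLE is a hypothetical environment variable and preemptible_flag a hypothetical helper. Neither is part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: print the --preemptible flag unless the
# PREEMPTIBLE environment variable is set to "false".
preemptible_flag() {
  if [ "${PREEMPTIBLE:-true}" != "false" ]; then
    printf '%s\n' "--preemptible"
  fi
}
```

Inside the gcloud compute instances create command, the fixed --preemptible line would then become an unquoted $(preemptible_flag) \ so that an empty result expands to nothing. Running the script with PREEMPTIBLE=false in the environment would launch a regular instance.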

Persistent Index?

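If you would like the index to persist across instance launches, create a Google Compute Engine disk called fusion-disk (note the screencast creates one called 'fusion-disk-2'). Whatever name you choose must match the --disk value in the script, which reads fusion-data in the listing above: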


Next, uncomment the following line in the script to enable attaching that disk to the instance:

$ vi
# uncomment this line to enable mounting a non-ephemeral disk
# gcloud compute instances attach-disk fusion-server-$NEW_UUID --disk=fusion-data --zone us-central1-a
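The disk can also be created from the shell instead of the console. The following is a rough sketch rather than part of the streams scripts. The run and create_fusion_disk helpers, the DRY_RUN variable and the 100GB default size are all assumptions made for this example. The zone and disk type mirror the instance settings in the launch script, since a disk has to live in the same zone as the instance it attaches to.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN is 1
# (the default in this sketch, so nothing is created by accident).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: create the persistent disk for the index.
#   $1: disk name, which must match the --disk value passed to attach-disk
#   $2: disk size (100GB is an arbitrary example, not from the original)
create_fusion_disk() {
  run gcloud compute disks create "$1" \
    --size "${2:-100GB}" \
    --type pd-ssd \
    --zone us-central1-a
}
```

Calling create_fusion_disk fusion-data prints the command it would run, and DRY_RUN=0 in the environment makes it execute. Keep in mind that attaching a disk only makes the block device visible to the instance. Formatting and mounting it are separate steps that this sketch does not cover.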

Starting Fusion

To start a Fusion instance, run the following command, passing a simple lowercase string with no special characters as the parameter. The script uses this string as the route name for the API proxy:

kordless@wisdom:~/$ cd streams/scripts
kordless@wisdom:/streams/scripts$ ./ myfusion
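The launch script only checks that a parameter was given, not what it looks like. A small guard along these lines could reject unsuitable names before any instance is created. This is a sketch under the assumption that "simple lowercase string" means lowercase letters and digits only. The function name is_valid_route is invented for the example.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the route name is non-empty and
# consists solely of lowercase ASCII letters and digits (an assumption
# about what counts as a "simple lowercase string").
is_valid_route() {
  # Force the C locale so the a-z range means ASCII lowercase only.
  local LC_ALL=C
  case "$1" in
    "" | *[!a-z0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}
```

At the top of the script it might be used as: is_valid_route "$1" || { echo "route must be lowercase letters and digits"; exit 1; }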

This results in the following output:

Fusion UI available in a few minutes at:
API access available in a few minutes at:

Note that the instance's APIs may be accessed through Wisdom's named APIs once the instance has started.

Here's the screencast for starting the instance:


Finally, after a few minutes of waiting, the instance should be reachable at the URL printed by the script. By default, Fusion serves its UI on port 8764.
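Rather than reloading the browser, the wait can be scripted. Below is a minimal polling sketch. The probe and wait_for_fusion names, the 30 attempts and the 10 second delay are illustrative choices, not part of the streams scripts. The probe treats any HTTP response as "up", which is an assumption about what a freshly started Fusion UI returns.

```shell
#!/bin/sh
# Hypothetical probe: succeed if the URL returns any HTTP response
# within 5 seconds.
probe() {
  curl --silent --output /dev/null --max-time 5 "$1"
}

# Hypothetical helper: poll a URL until it responds or attempts run out.
#   $1: URL to poll, e.g. http://<instance-ip>:8764
#   $2: maximum number of attempts (default 30)
#   $3: seconds to sleep between attempts (default 10)
wait_for_fusion() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}" i=1
  while [ "$i" -le "$attempts" ]; do
    if probe "$url"; then
      echo "reachable after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not reachable after $attempts attempts: $url" >&2
  return 1
}
```

With the IP variable from the launch script in scope, the call would be wait_for_fusion "http://$IP:8764". The defaults give the instance roughly five minutes to come up before giving up.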


Keep your eye out for more Fusion videos!