On GitHub: thoronas / thoronas.github.io
By Flynn O’Connor - @thoronas
By tracking and analyzing user search behaviour we can learn:
The cheapest and easiest way is via Google Analytics
There are a lot of things search analytics can provide beyond basic search queries. (Discuss bounce rates.) Search analytics are relatively easy to set up for your existing site, especially if you're already using Google Analytics. There are other…
WordPress search queries use "s" as the query parameter by default. That is the search parameter in WP_Query.
Provide GA with the search query parameter, which for WordPress is "s" 99.9% of the time. Look at commonly submitted queries and determine whether the results are what you intend. E-commerce sites can determine whether top results lead to pages that can't be monetized and are in turn losing money. It can also show you how much money people spend on your site when they use site search and when they don't.
Google Analytics provides fantastic in-depth search stats.
For more information about how to analyze site search data read Internal Site Search Analysis: Simple, Effective, Life Altering! by Avinash Kaushik
I highly recommend you read Internal Site Search Analysis on A List Apart. It gives you a good overview of how to use the Google Analytics data. Even though it's 5 years old, the information is still very relevant and will probably make you appreciate what an amazing tool internal site search analysis can be.
Dr. Andrei Broder wrote a report called A Taxonomy of Web Search. He determined there are three broad "types" of search queries: navigational, informational, and transactional.
Common patterns that result in counterproductive search experiences.
Search in WordPress doesn't offer a lot to improve the situation. Sometimes, as developers and designers, we exacerbate the situation with our decisions or our lack of thought about the patterns our users go through.
If a user is unfamiliar with your content, they can enter keywords that might not reflect their intentions. Trying variations of flawed search terms returns poor results and leads to frustrated users.
When a user is constantly jumping back and forth between SERP and individual results.
Your search engine results page (SERP) needs to be structured in a way that helps users find the results they're looking for.
Help users find exactly what they are looking for faster.
WP Search Suggest by Konstantin Obenland
Guide the user when they are exploring your content.
Relevanssi by Ville Saari
Control weighting of your content relevance.
Search meta, taxonomies, custom post types.
Keyword stemming
Search WP by Jonathan Christopher
Creating tokens out of the root of words
"Computers, Computing, Computes"
[Comput]
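To make the stemming idea concrete, here is a deliberately naive sketch in PHP. It is not how any of the plugins mentioned here implement stemming; the function name `naive_stem` and its short suffix list are assumptions made for illustration, and real stemmers (for example the Porter algorithm) apply far more rules.

```php
<?php
/**
 * Naive suffix-stripping stemmer, for illustration only.
 * The suffix list is a hypothetical, minimal example.
 */
function naive_stem( $word ) {
    $word = strtolower( trim( $word ) );
    // Longest suffixes first so "ers" is tried before "s".
    $suffixes = array( 'ers', 'ing', 'es', 's' );
    foreach ( $suffixes as $suffix ) {
        $len = strlen( $suffix );
        // Only strip when a stem of at least three letters remains.
        if ( strlen( $word ) - $len >= 3 && substr( $word, -$len ) === $suffix ) {
            return substr( $word, 0, -$len );
        }
    }
    return $word;
}

// All three words reduce to the same token.
$tokens = array_map( 'naive_stem', array( 'Computers', 'Computing', 'Computes' ) );
print_r( array_unique( $tokens ) );
```

Because every variant collapses to one token, a search for any of the three words can match documents containing the others.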
Filter search results by taxonomies or meta data.
Facet WP by Matt Gibbs
Eventually performance issues become too difficult to overcome.
Third party solutions that hijack WordPress search functionality are a better solution
Before we dive into Elasticsearch, are there questions so far?
Make remote requests from WordPress to Elasticsearch
    $url      = 'http://local.wordpress.dev:9200/{index}/{type}/{action}';
    $args     = array( 'method' => 'GET' );
    $response = wp_remote_request( $url, $args );
Creating a basic index in Elasticsearch.
    $url      = 'http://local.wordpress.dev:9200/wp-index';
    $args     = array( 'method' => 'PUT' );
    $response = wp_remote_request( $url, $args );
Specify a document type called post in our index.
    $mapping = array(
        'mappings' => array(
            'post' => array(
                // post property field mappings go here.
            )
        ),
        'settings' => array(
            // custom settings & analysis goes here
        )
    );
    'post' => array(
        'properties' => array(
            'post_title'   => array( 'type' => 'string' ),
            'post_content' => array( 'type' => 'string' ),
            'post_id'      => array( 'type' => 'long' ),
            'post_date'    => array(
                'type'   => 'date',
                'format' => 'YYYY-MM-dd HH:mm:ss',
            )
        )
    )
Elasticsearch also supports array, object, and multi-field types.
    'post_author' => array(
        'type'   => 'multi_field',
        'fields' => array(
            // analyzed copy for full-text search
            'author' => array( 'type' => 'string' ),
            // untouched copy for exact matching and filtering
            'author_raw' => array(
                'type'  => 'string',
                'index' => 'not_analyzed'
            )
        )
    )
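The fragments above can be assembled into one request body. The following is a minimal sketch, assuming the index has not been created yet and that the mapping is sent in the same PUT that creates it; the variable names `$properties` and `$index_body` are hypothetical. The `string` and `multi_field` types follow the Elasticsearch version these slides target. The request only runs when WordPress is loaded.

```php
<?php
// Field mappings for the "post" type, mirroring the fragments above.
$properties = array(
    'post_title'   => array( 'type' => 'string' ),
    'post_content' => array( 'type' => 'string' ),
    'post_id'      => array( 'type' => 'long' ),
    'post_date'    => array( 'type' => 'date', 'format' => 'YYYY-MM-dd HH:mm:ss' ),
    'post_author'  => array(
        'type'   => 'multi_field',
        'fields' => array(
            'author'     => array( 'type' => 'string' ),
            'author_raw' => array( 'type' => 'string', 'index' => 'not_analyzed' ),
        ),
    ),
);

// The "settings" key is left out on purpose: an empty PHP array
// would be encoded as [] rather than the {} object expected there.
$index_body = array(
    'mappings' => array(
        'post' => array( 'properties' => $properties ),
    ),
);

$json = json_encode( $index_body );

// Only attempt the request when running inside WordPress.
if ( function_exists( 'wp_remote_request' ) ) {
    $response = wp_remote_request(
        'http://local.wordpress.dev:9200/wp-index',
        array( 'method' => 'PUT', 'body' => $json )
    );
}
```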
What happens if you add new data fields?
Example: Custom Taxonomies
    "dynamic_templates" => array(
        array(
            "template_terms" => array(
                // applies to any field whose path starts with "terms."
                "path_match" => "terms.*",
                "mapping" => array(
                    "type" => "object",
                    "properties" => array(
                        "name"    => array( "type" => "string" ),
                        "term_id" => array( "type" => "long" )
                    )
                )
            )
        )
    )
With the index and mapping done, we can now populate the index with content
To do so we need to do the following:
Get the posts within WordPress
JSON encode the post data
Send the post data to Elasticsearch
Match the post data to our mapping.
    $post_for_ES = array(
        'post_title'   => get_the_title(),
        'post_content' => get_the_content(),
        'post_id'      => get_the_ID(),
        // Match the 'YYYY-MM-dd HH:mm:ss' format declared in the mapping.
        'post_date'    => get_the_date( 'Y-m-d H:i:s' ),
    );
Once the posts have been encoded, we have two options for sending them to Elasticsearch
    $url          = 'http://local.wordpress.dev:9200/wp-index/post/1';
    $post_content = json_encode( $post_for_ES );
    $args         = array( 'method' => 'PUT', 'body' => $post_content );
    $response     = wp_remote_request( $url, $args );
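The slides do not preserve which two options were meant. Besides indexing one document per request, Elasticsearch's Bulk API accepts many documents in a single request, so the sketch below assumes that is a reasonable second route. The helper name `es_build_bulk_body` and the sample posts are hypothetical.

```php
<?php
/**
 * Build a newline-delimited body for Elasticsearch's Bulk API.
 * $docs is keyed by document ID (here: the post ID).
 */
function es_build_bulk_body( array $docs, $index, $type ) {
    $lines = array();
    foreach ( $docs as $id => $doc ) {
        // One action line, followed by the document itself.
        $lines[] = json_encode( array(
            'index' => array( '_index' => $index, '_type' => $type, '_id' => $id ),
        ) );
        $lines[] = json_encode( $doc );
    }
    // The Bulk API requires the body to end with a newline.
    return implode( "\n", $lines ) . "\n";
}

// Hypothetical sample data standing in for real posts.
$docs = array(
    1 => array( 'post_title' => 'Hello world', 'post_id' => 1 ),
    2 => array( 'post_title' => 'Sample page', 'post_id' => 2 ),
);
$bulk_body = es_build_bulk_body( $docs, 'wp-index', 'post' );

// Only attempt the request when running inside WordPress.
if ( function_exists( 'wp_remote_request' ) ) {
    $response = wp_remote_request(
        'http://local.wordpress.dev:9200/_bulk',
        array( 'method' => 'POST', 'body' => $bulk_body )
    );
}
```

For a site with many posts, batching like this avoids one HTTP round trip per post.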
In order to hijack WordPress default search functionality we need to do the following:
Use pre_get_posts to capture search query.
Query Elasticsearch and return an array of post IDs
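    function search_filter( $query ) {
        // Only touch the main front-end search query.
        if ( ! is_admin() && $query->is_main_query() && $query->is_search() ) {
            $search_query = stripslashes( get_search_query( false ) );
            // elasticsearch_function() is a placeholder that returns an array of post IDs.
            $elasticsearch_posts = elasticsearch_function( $search_query );
            // Set post__in on the query object passed to the hook.
            $query->set( 'post__in', $elasticsearch_posts );
        }
    }
    add_action( 'pre_get_posts', 'search_filter' );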
Make sure to nuke the default WordPress search.
    function clear_sql_search_clause( $search ) {
        if ( is_search() && ! is_admin() ) {
            $search = '';
        }
        return $search;
    }
    add_filter( 'posts_search', 'clear_sql_search_clause' );
Querying documents in Elasticsearch utilizes several APIs that are nested in the Search API:
Querying multiple fields is a common requirement.
Example: query post title and post content.
    $ES_query = array(
        'query' => array(
            // specify query type
            'multi_match' => array(
                // the query term
                'query' => 'beer',
                // what fields to search through
                'fields' => array( 'post_title^2', 'post_content' )
            )
        )
    );
Take the constructed query, pass it to HTTP API.
    // search our wp-index
    $url    = "http://local.wordpress.dev:9200/wp-index/_search";
    $method = "POST";
    // pass the query we constructed in the previous slide
    $body    = json_encode( $ES_query );
    $arg     = array( 'method' => $method, 'body' => $body );
    $request = wp_remote_request( $url, $arg );
Run a nested query after applying filters
    $ES_query = array(
        'query' => array(
            'filtered' => array(
                'query' => array(
                    'multi_match' => array(
                        'query'  => 'beer',
                        'fields' => array( 'post_title^2', 'post_content' )
                    )
                ),
                // exact match on the not_analyzed author field
                'filter' => array(
                    'term' => array( "post_author.author_raw" => "Flynn" )
                )
            )
        )
    );
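In practice the search term comes from the user rather than being hard-coded. A minimal sketch of a builder that covers both the plain and the filtered query might look like this; the function name `es_build_search_query` is hypothetical. Note that the `filtered` query belongs to the Elasticsearch version these slides target; later versions replace it with a `bool` query.

```php
<?php
/**
 * Build a multi_match query, optionally wrapped in an author filter.
 * Hypothetical helper; field names follow the mapping used in these slides.
 */
function es_build_search_query( $term, $author = null ) {
    $match = array(
        'multi_match' => array(
            'query'  => $term,
            // post_title^2 boosts title matches over content matches.
            'fields' => array( 'post_title^2', 'post_content' ),
        ),
    );

    // Without an author, return the plain multi_match query.
    if ( null === $author ) {
        return array( 'query' => $match );
    }

    return array(
        'query' => array(
            'filtered' => array(
                'query'  => $match,
                'filter' => array(
                    'term' => array( 'post_author.author_raw' => $author ),
                ),
            ),
        ),
    );
}

$plain_query    = es_build_search_query( 'beer' );
$filtered_query = es_build_search_query( 'beer', 'Flynn' );
```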
We've queried Elasticsearch and got results!
We need to parse the results.
    // our Elasticsearch query from previous slides
    $request = wp_remote_request( $url, $arg );
    // grab the body of request which has the found posts
    $results = json_decode( wp_remote_retrieve_body( $request ) );
    // pass the results into variable.
    $hits = $results->hits->hits;
    // do what you want with the data from here.
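One thing to do with the hits is to turn them into the array of post IDs that the `pre_get_posts` callback expects from its `elasticsearch_function` placeholder. The sketch below assumes each document was indexed with a `post_id` field, as in the mapping; the function name `es_extract_post_ids` and the sample response are hypothetical.

```php
<?php
/**
 * Pull post IDs out of the JSON body of an Elasticsearch search response.
 * Returns an empty array when the body is malformed or has no hits.
 */
function es_extract_post_ids( $json ) {
    $results = json_decode( $json );
    if ( ! isset( $results->hits->hits ) || ! is_array( $results->hits->hits ) ) {
        return array();
    }

    $ids = array();
    foreach ( $results->hits->hits as $hit ) {
        // post_id lives in the stored source of each hit.
        if ( isset( $hit->_source->post_id ) ) {
            $ids[] = (int) $hit->_source->post_id;
        }
    }
    return $ids;
}

// Trimmed-down, hypothetical response body.
$sample   = '{"hits":{"total":2,"hits":[{"_id":"1","_source":{"post_id":12}},{"_id":"2","_source":{"post_id":34}}]}}';
$post_ids = es_extract_post_ids( $sample );
```

One thing to check when wiring this up: WP_Query appears to ignore an empty `post__in` array, so a search with no hits may need special handling rather than passing the empty array through.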
Some examples of Elasticsearch functionality
When querying Elasticsearch you can specify particular data to be returned as facets or aggregations.
Useful for getting aggregate data on:
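As one hedged illustration of an aggregation, the sketch below counts posts per author using a `terms` aggregation on the `post_author.author_raw` field from the mapping. The aggregation name `posts_per_author`, the helper `es_bucket_counts`, and the sample response are assumptions made for this example.

```php
<?php
// Request body: no hits, just the aggregation buckets.
$ES_aggregation = array(
    'size' => 0,
    'aggs' => array(
        'posts_per_author' => array(
            'terms' => array( 'field' => 'post_author.author_raw' ),
        ),
    ),
);

/**
 * Turn the buckets of a named terms aggregation into key => count pairs.
 */
function es_bucket_counts( $json, $agg_name ) {
    $results = json_decode( $json, true );
    $counts  = array();
    if ( isset( $results['aggregations'][ $agg_name ]['buckets'] ) ) {
        foreach ( $results['aggregations'][ $agg_name ]['buckets'] as $bucket ) {
            $counts[ $bucket['key'] ] = $bucket['doc_count'];
        }
    }
    return $counts;
}

// Trimmed-down, hypothetical response body.
$sample = '{"aggregations":{"posts_per_author":{"buckets":[{"key":"Flynn","doc_count":3}]}}}';
$counts = es_bucket_counts( $sample, 'posts_per_author' );
```

The request body would be JSON encoded and sent to the `_search` endpoint in the same way as the queries shown earlier.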
WordPress.com related posts available through Jetpack.
Swiftype provides managed Elasticsearch functionality.
Customize either using the Elasticsearch APIs