December 5: Indexing for Better Search Performance - InformationWeek

December 5: Indexing for Better Search Performance

OK everyone -- thank you for your time.  This will conclude our chat session and the series.  Have a great weekend!

Apprentice

You're welcome, thanks for attending!

Apprentice

Cool! Thanks Greg and Mohammed! Thanks for the great week of interesting discussion!

Apprentice

The official release date hasn't been announced, but we are currently in the final phase of early access, which is available to anyone. Check out: ea dot marklogic dot com

Apprentice

It seems like the MLJS is really streamlining the process for devs. When is ML8 coming out?

Apprentice

In simple terms, MLJS is a MarkLogic JavaScript driver, so that you can access MarkLogic 7 using JavaScript. As Greg pointed out, in ML8 JavaScript is handled natively within the database as well as the middle-tier server.

Apprentice

MLJS is a wrapper built on top of the MarkLogic REST API. It's an open source project enabling developers to work with MarkLogic using JavaScript. In the next release, MarkLogic 8, there will be an official Node.js API as well as built-in server-side JavaScript. Very exciting stuff for the developer audience.

Apprentice

Okay, another one that might be going further afield, can you tell me a little about MLJS Workplace? (I follow the blog of a MarkLogic employee, so not sure if most non-developers would be interested)

Apprentice

hah, fair enough, but thanks re: Profiler

Apprentice

There are also API functions that let a developer expose a query plan, which can be very useful for performance tuning.

Apprentice

RE: the profiler. MarkLogic has a built-in profiler for optimizing queries. We have tools that expose that profile data to show you how the query is being resolved and where the slow portions are. I'd encourage you to take a look at our whitepaper "Inside MarkLogic Server" (just do a Google search on that) for a good read on some more details... that would take me too long to type in here.

Apprentice

Indexing setup is super easy. When you say "proprietary index", I translate that as meaning a custom range index or field based on your unique data. That can be set up with a couple of clicks. Once set up, you can export the entire database configuration, making it simple to migrate those configurations throughout a cluster or between environments (Dev to Test, Test to Prod, for example).

Apprentice

Also, can you explain a little bit more about the Profiler? 

Apprentice

But the clients can also set up their own proprietary indexes... how much time would that take in general? Like, best guess for a standard indexing project to set up?

Apprentice

And yes, all our indexes are available to all our clients.

Apprentice

The universal index requires no setup. It's implemented out of the box. All you need to do is load your data and it gets indexed. The additional indexes like case/diacritics, positions, wildcards, geospatial... you just turn them on for a database and they automatically get created. Turning them on can be done via an Admin GUI or via REST calls.

Apprentice
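For readers wondering what "turning them on via REST calls" might look like, here is a minimal Python sketch that only builds the HTTP request and does not send it. It assumes the Management REST API layout documented for MarkLogic 8 (a PUT to /manage/v2/databases/{name}/properties on port 8002) and the property names "word-positions" and "trailing-wildcard-searches". The host, the database name and the helper function are illustrative, so check all of these against the documentation for your server version.

```python
import json
import urllib.request


def build_index_settings_request(host, database, settings, port=8002):
    """Build (but do not send) a PUT request that updates database properties.

    Assumption: the endpoint path and port follow the MarkLogic Management
    REST API; this helper itself is hypothetical and not part of any
    MarkLogic client library.
    """
    url = f"http://{host}:{port}/manage/v2/databases/{database}/properties"
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed property names; verify them for your MarkLogic version.
settings = {"word-positions": True, "trailing-wildcard-searches": True}
req = build_index_settings_request("localhost", "Documents", settings)
print(req.get_method(), req.full_url)
```

Actually sending the request would also need admin credentials (MarkLogic typically uses HTTP digest authentication), which this sketch leaves out on purpose.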

So, the big thing I'm having a hard time grokking with indexes... how much setup work is involved on these? You mentioned that you had a big index that was used by the FBI originally. Is that for use by all clients?

Apprentice

Yes, I guess that pretty much answers my question... of course you can.

Apprentice

Can you clarify for me what you mean by virtualize?  Do you mean run MarkLogic on VMs?

Apprentice

Interesting! Can you virtualize those?

Apprentice

The indexes are smart and use 64-bit hashing algorithms, which reduces space. We're not storing strings like the examples on my slide, which look like a spreadsheet. That's just useful for us as humans to think about the indexes.

Apprentice
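To make the hashing point above concrete, here is a toy Python sketch of an inverted index keyed by 64-bit term hashes instead of strings. It is not MarkLogic's actual algorithm or on-disk format: the choice of BLAKE2b and the function names term_key, build_index and lookup are assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Set


def term_key(term: str) -> int:
    """Map a term of any length to a fixed-width 64-bit integer key."""
    digest = hashlib.blake2b(term.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def build_index(docs: Dict[int, str]) -> Dict[int, Set[int]]:
    """Build a toy inverted index: hashed term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[term_key(word)].add(doc_id)
    return dict(index)


def lookup(index: Dict[int, Set[int]], term: str) -> Set[int]:
    """Hash the query term the same way and fetch the matching doc ids."""
    return index.get(term_key(term.lower()), set())


docs = {1: "indexing for better search", 2: "search performance tuning"}
index = build_index(docs)
print(sorted(lookup(index, "search")))  # prints [1, 2]
```

Every key takes 8 bytes no matter how long the original term is, which is the space saving the hashing approach is after. The trade-off is that the index alone can no longer show the original strings.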

@Wendy:  Sure -- at a high level, the more indexes that are turned on for a DB, the more disk space the indexes will require.  The best way to determine the disk space requirement is to load a sample of your data with the indexes you desire and get a baseline.

Apprentice
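The baseline approach described above can be turned into a back-of-the-envelope estimate. The Python sketch below assumes index size grows roughly linearly with document count, which is a simplification, and the sample numbers are made up for illustration rather than measured.

```python
def estimate_index_bytes(sample_docs: int, sample_index_bytes: int, total_docs: int) -> int:
    """Extrapolate index size from a measured sample.

    Assumption: index size scales linearly with the number of documents.
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    bytes_per_doc = sample_index_bytes / sample_docs
    return round(bytes_per_doc * total_docs)


# Hypothetical baseline: 10,000 sample documents produced 250 MB of indexes.
estimate = estimate_index_bytes(10_000, 250_000_000, 1_000_000)
print(f"{estimate / 1e9:.1f} GB")  # prints 25.0 GB
```

Repeating the measurement with different combinations of indexes turned on would show how much each one contributes.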

Craig, can you discuss the disk space considerations you mentioned on slide 16 a little more? 

Apprentice

Hi all, audio is live! If you don't see the audio bar at the top of the screen, please refresh your browser. It may take a couple of tries. When you see the audio bar, if it doesn't start automatically, hit the play button. If you experience audio interruptions and are using IE, try using FF or Chrome as your browser. Many people experience issues with IE. Also, make sure your Flash player is updated to the current version. Some companies block live audio streams, so if that is the case for your company, the class will be archived on this page immediately following the class and you can listen then. People don't experience any issues with the audio for the archived version.

Apprentice

We're showing about three minutes to go -- we're glad everyone is here for the final class in this course!

Strategist

The session will begin at 2 pm, and it will be broadcast live, so you will hear it from this location.

Apprentice

You seem to be signed in already, Trivial2, since you're able to comment.

Apprentice

Where do I sign in for this? I am already registered.

Apprentice

We'd love to have your voice in the discussion here. To take part, just type your comment or question into the "Your Post" box and then click on the "Post" button below the box. Feel free to introduce yourself before the class starts -- I think you'll find that we're a very friendly community here! 

Strategist

Hey, everyone, we're glad you could join us! When the class is scheduled to start, an audio player should appear above the "Your Post" window. If it doesn't appear, you might need to refresh your browser until it does. If it appears but doesn't start playing, then you may need to click on the "play" button on the far left of the player. 

Strategist

