
elasticsearch bulk add

Batch insert data into Elasticsearch
When I want to add documents to Elasticsearch in batch, this is how I do it.
First, I create an index:
curl -XPUT localhost:9200/tmp2
I prepare the file for batch insertion (see the official documentation). Each document takes two lines: an action line, then the document itself:

{ "create" : { "_index" : "tmp2", "_id" : "1" } }
{"title": "RxJS: How to Use refCount","link": "https://blog.angularindepth.com/rxjs-how-to-use-refcount-73a0c6619a4e","text": "My previous article — Understanding the publish and share Operators — looked only briefly at the refCount method. Let’s look at it more closely here."}
{ "create" : { "_index" : "tmp2", "_id" : "12" } }
{"title": "I reverse-engineered Zones (zone.js) and here is what I’ve found","link": "https://blog.angularindepth.com/i-reverse-engineered-zones-zone-js-and-here-is-what-ive-found-1f48dc87659b","text": "Zones is a new mechanism that helps developers work with multiple logically-connected async operations. Zones work by associating each async operation with a zone."}
Save the file under the name "news.json".
I then perform the insertion:
curl -H "Content-Type: application/x-ndjson" -XPOST  localhost:9200/_bulk --data-binary "@news.json"
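A successful HTTP status from _bulk does not by itself mean every document was inserted: the response body carries an "errors" flag and one entry per action under "items". For instance, a "create" action fails with status 409 when the _id already exists. Here is a minimal Python sketch for listing the failed actions; the function name failed_items and the sample response are mine, made up for illustration.

```python
import json

def failed_items(response_text):
    """Return (action, _id, status, error type) for each failed bulk action."""
    response = json.loads(response_text)
    if not response.get("errors"):
        return []
    failures = []
    for item in response.get("items", []):
        # Each item has a single key naming the action ("create", "index", ...).
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result.get("_id"),
                                 result.get("status"),
                                 result["error"].get("type")))
    return failures

# Hypothetical response, shaped like what _bulk returns when _id 1 already exists.
sample = json.dumps({
    "took": 5,
    "errors": True,
    "items": [
        {"create": {"_index": "tmp2", "_id": "1", "status": 409,
                    "error": {"type": "version_conflict_engine_exception",
                              "reason": "document already exists"}}},
        {"create": {"_index": "tmp2", "_id": "12", "status": 201,
                    "result": "created"}},
    ],
})
print(failed_items(sample))
```

To use it, save the output of the curl command to a file and pass the file content to failed_items.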
Pay attention to the fact that each JSON object must fit on a single line: do not pretty-print your JSON data. The file must also end with a newline.
For example, this will lead to an error, because it puts "\n" where Elasticsearch does not expect it:

{ "create" : { "_index" : "tmp2", "_id" : "1" } }
{
 "title": "RxJS: How to Use refCount",
 "link": "https://blog.angularindepth.com/rxjs-how-to-use-refcount-73a0c6619a4e",
 "text": "My previous article — Understanding the publish and share Operators — looked only briefly at the refCount method. Let’s look at it more closely here."
}
{ "create" : { "_index" : "tmp2", "_id" : "12" } }
{
 "title": "I reverse-engineered Zones (zone.js) and here is what I’ve found",
 "link": "https://blog.angularindepth.com/i-reverse-engineered-zones-zone-js-and-here-is-what-ive-found-1f48dc87659b",
 "text": "Zones is a new mechanism that helps developers work with multiple logically-connected async operations. Zones work by associating each async operation with a zone."}
The ERROR will be:
{"error":
 {
  "root_cause":[
   {
    "type":"illegal_argument_exception",
    "reason":"Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"
   }],
  "type":"illegal_argument_exception","reason":"Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"
 },
  "status":400
}
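To avoid this error, the file can be generated instead of written by hand. Here is a minimal Python sketch, assuming the documents are already available as a dictionary keyed by _id; the function name to_bulk_ndjson is mine, and the documents are shortened to their titles. Without the indent option, json.dumps writes each object on a single line and escapes any newline inside a string.

```python
import json

def to_bulk_ndjson(index, docs):
    """Build a _bulk body: one action line and one source line per document."""
    lines = []
    for doc_id, source in docs.items():
        action = {"create": {"_index": index, "_id": doc_id}}
        lines.append(json.dumps(action))
        # ensure_ascii=False keeps characters such as ’ readable in the file.
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

docs = {
    "1": {"title": "RxJS: How to Use refCount"},
    "12": {"title": "I reverse-engineered Zones (zone.js) and here is what I’ve found"},
}
body = to_bulk_ndjson("tmp2", docs)
print(body, end="")
```

Writing body to "news.json" with UTF-8 encoding gives a file that can be sent with the curl command above.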

