You can use Logstash to collect data and migrate the collected data to Elasticsearch in CSS, where you can then query and manage it. Data files can be in JSON or CSV format.
Logstash is an open-source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to Elasticsearch. For details about Logstash, visit the following website: https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html
Data is imported in one of two ways, depending on where Logstash is deployed: on an external network, or on an ECS that resides in the same VPC as the cluster.
Use the OSS version of Logstash, and make sure its version is the same as the CSS cluster version.
Figure 1 illustrates how data is imported when Logstash is deployed on an external network.
ssh -g -L <Local port of the jump host:Private network address and port number of a node> -N -f root@<Private IP address of the jump host>
For example, assume that port 9200 on the jump host has been assigned external network access permissions, the private network address and port number of the node are 192.168.0.81 and 9200, respectively, and the private IP address of the jump host is 192.168.0.227. Run the following command to perform port mapping:
ssh -g -L 9200:192.168.0.81:9200 -N -f root@192.168.0.227
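The mapping command has three inputs, so it can help to assemble it from variables before running it. The following is a minimal sketch, not part of CSS or ssh: build_tunnel_cmd is a hypothetical helper name, and it only prints the command so that you can review it first.

```shell
#!/bin/sh
# build_tunnel_cmd is a hypothetical helper; it prints the command and does not run it.
build_tunnel_cmd() {
    local_port="$1"      # port on the jump host that is open to the external network
    node_addr="$2"       # private network address and port number of a cluster node
    jump_host_ip="$3"    # private IP address of the jump host
    printf 'ssh -g -L %s:%s -N -f root@%s\n' "$local_port" "$node_addr" "$jump_host_ip"
}

# Print the command for the example values used above.
build_tunnel_cmd 9200 192.168.0.81:9200 192.168.0.227
```

Review the printed command, then run it on the jump host.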
For example, the data file access_20181029_log needs to be imported, the file is stored in /tmp/access_log/, and it contains the following data:
| All | Heap used for segments | | 18.6403 | MB |
| All | Heap used for doc values | | 0.119289 | MB |
| All | Heap used for terms | | 17.4095 | MB |
| All | Heap used for norms | | 0.0767822 | MB |
| All | Heap used for points | | 0.225246 | MB |
| All | Heap used for stored fields | | 0.809448 | MB |
| All | Segment count | | 101 | |
| All | Min Throughput | index-append | 66232.6 | docs/s |
| All | Median Throughput | index-append | 66735.3 | docs/s |
| All | Max Throughput | index-append | 67745.6 | docs/s |
| All | 50th percentile latency | index-append | 510.261 | ms |
Create the access_log folder if it does not exist.
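Preparing the folder and a small test file could look like the sketch below. DATA_DIR is a hypothetical variable that defaults to the path used in this example, and only two of the sample rows are written. A real data file would already contain the full data set, so do not run this sketch over an existing file.

```shell
#!/bin/sh
# DATA_DIR is an assumption for illustration; it defaults to the example path.
DATA_DIR="${DATA_DIR:-/tmp/access_log}"

# Create the folder only if it does not exist yet.
mkdir -p "$DATA_DIR"

# Write two of the sample rows; this overwrites any file with the same name.
cat > "$DATA_DIR/access_20181029_log" <<'EOF'
| All | Heap used for segments | | 18.6403 | MB |
| All | Heap used for doc values | | 0.119289 | MB |
EOF

# Show how many lines the file holds.
wc -l < "$DATA_DIR/access_20181029_log"
```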
cd /<Logstash installation directory>/
vi logstash-simple.conf
input {
    # Location of data
}
filter {
    # Related data processing
}
output {
    elasticsearch {
        hosts => "<EIP of the jump host>:<Number of the port assigned external network access permissions on the jump host>"
        # (Optional) If communication encryption has been enabled on the cluster, add the following configuration:
        ssl => true
        ssl_certificate_verification => false
    }
}
Take the data files in the /tmp/access_log/ path described above as an example. Assume that the import starts from the first row of the data file, no filter is specified (so no data processing is performed), the public IP address and port number of the jump host are 192.168.0.227 and 9200, respectively, and the target index is named myindex. Edit the configuration file as follows, and then enter :wq to save the file and exit.
input {
    file {
        path => "/tmp/access_log/*"
        start_position => "beginning"
    }
}
filter {
}
output {
    elasticsearch {
        hosts => "192.168.0.227:9200"
        index => "myindex"
    }
}
If a license error is reported, set ilm_enabled to false.
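As a sketch of where that setting could go, the following writes the example configuration with ilm_enabled added to the elasticsearch output. It assumes the file name logstash-simple.conf and the example values used above, and it overwrites any existing file of that name in the current directory.

```shell
#!/bin/sh
# Write the example configuration with index lifecycle management turned off.
# Assumption: run this in the directory where logstash-simple.conf is kept.
cat > logstash-simple.conf <<'EOF'
input {
    file {
        path => "/tmp/access_log/*"
        start_position => "beginning"
    }
}
filter {
}
output {
    elasticsearch {
        hosts => "192.168.0.227:9200"
        index => "myindex"
        # ilm_enabled is an option of the elasticsearch output plugin.
        ilm_enabled => false
    }
}
EOF
```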
If security mode is enabled for the cluster, download the security certificate first, and then reference it in the configuration file together with the username and password:

input {
    file {
        path => "/tmp/access_log/*"
        start_position => "beginning"
    }
}
filter {
}
output {
    elasticsearch {
        hosts => ["https://192.168.0.227:9200"]
        index => "myindex"
        user => "admin"
        password => "******"
        cacert => "/logstash/logstash6.8/config/CloudSearchService.cer"
    }
}
password: password for logging in to the cluster
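If you prefer not to store the password in the configuration file, Logstash can resolve ${VAR} references from environment variables when it loads the configuration. The sketch below assumes a hypothetical variable name, ES_PWD, and reuses the example values above. It overwrites logstash-simple.conf in the current directory.

```shell
#!/bin/sh
# ES_PWD is a hypothetical variable name; set it to the cluster password
# in the same shell session that later starts Logstash.
export ES_PWD='******'

# The quoted 'EOF' keeps ${ES_PWD} literal, so Logstash (not the shell) resolves it.
cat > logstash-simple.conf <<'EOF'
input {
    file {
        path => "/tmp/access_log/*"
        start_position => "beginning"
    }
}
filter {
}
output {
    elasticsearch {
        hosts => ["https://192.168.0.227:9200"]
        index => "myindex"
        user => "admin"
        password => "${ES_PWD}"
        cacert => "/logstash/logstash6.8/config/CloudSearchService.cer"
    }
}
EOF
```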
./bin/logstash -f logstash-simple.conf
This command must be executed in the directory where the logstash-simple.conf file is stored. For example, if the logstash-simple.conf file is stored in /root/logstash-7.1.1/, go to the directory before running the command.
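Before starting the import, you may want to check the configuration file for syntax errors. The sketch below uses the Logstash --config.test_and_exit flag, which parses the configuration and exits without running the pipeline. check_config is a hypothetical helper name, and the installation path is the example path from the note above.

```shell
#!/bin/sh
# check_config is a hypothetical helper.
# $1: Logstash installation directory, $2: configuration file in that directory.
check_config() {
    logstash_home="$1"
    conf_file="$2"
    if [ ! -x "$logstash_home/bin/logstash" ]; then
        echo "bin/logstash not found in $logstash_home" >&2
        return 1
    fi
    # Run from the installation directory so the relative paths resolve.
    ( cd "$logstash_home" && ./bin/logstash -f "$conf_file" --config.test_and_exit )
}

# Example call; report the result either way.
if check_config /root/logstash-7.1.1 logstash-simple.conf; then
    echo "configuration OK"
else
    echo "configuration check failed or Logstash not found"
fi
```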

On the Console page of Kibana, run the following command to search for the data. If the returned data matches the imported data, the import was successful.
GET myindex/_search
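The same search can also be sent from a command line as an HTTP request. The sketch below assumes a cluster without security mode (plain HTTP) and uses the example address above. search_url is a hypothetical helper that only builds the URL, and the curl call is left commented out so that nothing is sent by accident.

```shell
#!/bin/sh
# search_url is a hypothetical helper that builds the URL behind "GET myindex/_search".
search_url() {
    host_port="$1"   # address and port that Logstash wrote to
    index="$2"       # name of the target index
    printf 'http://%s/%s/_search?pretty\n' "$host_port" "$index"
}

url=$(search_url 192.168.0.227:9200 myindex)
echo "$url"

# To send the request, uncomment the next line on a host that can reach the cluster:
# curl -s "$url"
```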
Figure 4 illustrates how data is imported when Logstash is deployed on an ECS that resides in the same VPC as the cluster to which data is to be imported.
For example, the data file to be imported is stored in /tmp/access_log/ and contains the following data:
| All | Heap used for segments | | 18.6403 | MB |
| All | Heap used for doc values | | 0.119289 | MB |
| All | Heap used for terms | | 17.4095 | MB |
| All | Heap used for norms | | 0.0767822 | MB |
| All | Heap used for points | | 0.225246 | MB |
| All | Heap used for stored fields | | 0.809448 | MB |
| All | Segment count | | 101 | |
| All | Min Throughput | index-append | 66232.6 | docs/s |
| All | Median Throughput | index-append | 66735.3 | docs/s |
| All | Max Throughput | index-append | 67745.6 | docs/s |
| All | 50th percentile latency | index-append | 510.261 | ms |
cd /<Logstash installation directory>/
vi logstash-simple.conf
input {
    # Location of data
}
filter {
    # Related data processing
}
output {
    elasticsearch {
        hosts => "<Private network address and port number of the node>"
        # (Optional) If communication encryption has been enabled on the cluster, add the following configuration:
        ssl => true
        ssl_certificate_verification => false
    }
}
If the cluster contains multiple nodes, you are advised to replace <Private network address and port number of the node> with the private network addresses and port numbers of all nodes in the cluster, so that the import can continue if a single node becomes faulty. Use commas (,) to separate the entries. The following is an example:
hosts => ["192.168.0.81:9200","192.168.0.24:9200"]
If the cluster contains only one node, the format is as follows:
hosts => "192.168.0.81:9200"
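For a cluster with many nodes, the hosts line can be generated instead of typed by hand. The following sketch uses a hypothetical helper, build_hosts, which quotes each address and joins the entries with commas. With a single address it prints a one-element array, which should be equivalent to the plain string form.

```shell
#!/bin/sh
# build_hosts is a hypothetical helper: it turns node addresses into a hosts line.
build_hosts() {
    list=""
    for node in "$@"; do
        # Quote each address and join the entries with commas.
        if [ -z "$list" ]; then
            list="\"$node\""
        else
            list="$list,\"$node\""
        fi
    done
    printf 'hosts => [%s]\n' "$list"
}

# Print the line for the two example nodes.
build_hosts 192.168.0.81:9200 192.168.0.24:9200
```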
Take the data files in the /tmp/access_log/ path described above as an example. Assume that the import starts from the first row of the data file, no filter is specified (so no data processing is performed), the private network address and port number of the node in the target cluster are 192.168.0.81 and 9200, respectively, and the target index is named myindex. Edit the configuration file as follows, and then enter :wq to save the file and exit.
input {
    file {
        path => "/tmp/access_log/*"
        start_position => "beginning"
    }
}
filter {
}
output {
    elasticsearch {
        hosts => "192.168.0.81:9200"
        index => "myindex"
    }
}
If security mode is enabled for the cluster, download the security certificate first, and then reference it in the configuration file together with the username and password:

input {
    file {
        path => "/tmp/access_log/*"
        start_position => "beginning"
    }
}
filter {
}
output {
    elasticsearch {
        hosts => ["https://192.168.0.81:9200"]
        index => "myindex"
        user => "admin"
        password => "******"
        cacert => "/logstash/logstash6.8/config/CloudSearchService.cer"
    }
}
password: password for logging in to the cluster
./bin/logstash -f logstash-simple.conf

On the Console page of Kibana, run the following command to search for the data. If the returned data matches the imported data, the import was successful.
GET myindex/_search
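Another quick check is to compare the number of indexed documents with the number of lines in the data files, because the file input creates one event per line by default. The sketch below parses a sample response body in the shape returned by GET myindex/_count. extract_count is a hypothetical helper, and the sample values are illustrative, not real results.

```shell
#!/bin/sh
# extract_count is a hypothetical helper: it pulls the number out of a _count response body.
extract_count() {
    printf '%s' "$1" | sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Illustrative response body; a live cluster returns its own values.
response='{"count":11,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}'
indexed=$(extract_count "$response")
echo "indexed documents: $indexed"

# Against a live cluster without security mode, the body could come from:
# response=$(curl -s "http://192.168.0.81:9200/myindex/_count")
# and the expected value from:
# cat /tmp/access_log/* | wc -l
```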