Large Import

How long is it expected to import 1 TB of logs into Piwik? Are there any suggestions I can consider in regards to performance tuning for such a large import on a system with 16 GB RAM, 16 Cores, and enterprise storage?

Imports about 300 lines per sec so will depend a lot on your server, your logs, your piwik server, etc.

The system I have has 16 cores, 16 GB of RAM. It is attached to tier-1 symmetrix storage. The sql binaries, logs, data, backups are all on separate luns. Doing a bulk import is like watching paint dry. How is one expected to say this is a replacement for Urchin Web Analytics when I can’t even accomplish the same thing in a timely manner? For example:

A. Urchin Web Analytics: Imports logs for one month in 12-18 hours.
B. Piwik imports logs from several days in 24 hours.

What gives with the poor performance? (HEAD-HIT-KEYBOARD)

So, I did a benchmark between Urchin and Piwik. In 12 hours, I was able to import 45 days worth of analytics into Urchin (on a machine that is 6 years old running Windows Server 2003R2 with 4GB of ram, and 2 Xeon CPUs). As of the same 12 hours, I have been only able to import 5 days worth of logs into Piwik. Is there perhaps something I am doing wrong.

Here is my new environment that I am using to replace the Urchin Web Analytics:

Windows Server 2008R2
IIS 7
PHP 5.4.11
MySQL 5.5.29
Piwik 1.10.1
2xXeon CPU X5687
16 GB RAM

Storage:
C: OS (Local, RAID1)
E: MySQL Binaries (Symmetrix Tier-1 Storage)
F: MySQL Bin Logs (Symmetrix Tier-1 Storage)
G: MySQL Data Files (Symmetrix Tier-1 Storage)
H: MySQL Backup Files (Symmetrix Tier-1 Storage)

Is there any additional tuning for mysql or IIS that I should be performing?

my.ini:

The following options will be passed to all MySQL clients

[client]
#password = your_password
port = 3306
socket = /tmp/mysql.sock

Here follows entries for some specific programs

The MySQL server

[mysqld]
basedir=E:/mysql
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 16M
max_allowed_packet = 1M
table_open_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
datadir=G:\Data1\MYSQL_DATA\DATA
innodb_file_per_table

Replication Master Server (default)

binary logging is required for replication

log-bin=F:\Logs1\MYSQL_LOG\mysql-bin

binary logging format - mixed recommended

binlog_format=mixed

required unique id between 1 and 2^32 - 1

defaults to 1 if master-host is not set

but will not function as a master if omitted

server-id = 1

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

Remove the next comment character if you are not familiar with SQL

#safe-updates

[myisamchk]
key_buffer_size = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout

how many request per second can you import? generally can process 200 requests per second

I’m normally processing about 383records/sec (avg) / 3,200/sec (current). What I can’t figure out is the machine is idle during the “0 records/sec”.

31969708 lines parsed, 7168280 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31979634 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 3239 records/sec (current)
31982620 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31982620 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31982620 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31982620 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31982620 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31982620 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31982620 lines parsed, 7171519 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31993353 lines parsed, 7174751 lines recorded, 383 records/sec (avg), 3232 records/sec (current)
31995582 lines parsed, 7174751 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31995582 lines parsed, 7174751 lines recorded, 383 records/sec (avg), 0 records/sec (current)
31995582 lines parsed, 7174751 lines recorded, 383 records/sec (avg), 0 records/sec (current)

I also made some adjustments to my my.ini/my.cnf file:

$ cat /cygdrive/e/mysql/my.ini | grep -v ^#
[client]
port = 3306
socket = /tmp/mysql.sock

[mysqld]
basedir=E:/mysql
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 384M
max_allowed_packet = 1M
table_open_cache = 512
sort_buffer_size = 2M
net_buffer_length = 8K
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 8M
datadir=G:\Data1\MYSQL_DATA\DATA
innodb_file_per_table

table_cache=1024
tmp_table_size=6G
max_heap_table_size=6G
thread_cache=16
query_cache_size=1G
query_cache_limit=4M
expire_logs_days=5
max_binlog_size=1024M
skip-name-resolve
innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size=8G
innodb_additional_mem_pool_size=100M
innodb_log_buffer_size=18M

log-bin=F:\Logs1\MYSQL_LOG\mysql-bin

binlog_format=mixed

server-id = 1

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[myisamchk]
key_buffer_size = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[mysqlhotcopy]
interactive-timeout

…I’m using cygwin to process the log files since the cmd.exe doesn’t properly expand “*” and powershell you need to write a book to do a simple while loop. …and yes, I’m stuck on WinDOS.

I was also getting error 500 on occasion. This is what I did to fix it:

  1. Open a command in this directory: c:\Windows\System32\inetsrv
  2. Execute: appcmd list config -section:system.webServer/fastCgi

Output should look like this:

<system.webServer>






</system.webServer>

  1. Change the timeout to 1 hour, execute:
    appcmd set config -section:system.webServer/fastCgi "-[fullPath=‘C:\php\php-cgi.exe’].activityTimeout:3600"
    appcmd set config -section:system.webServer/fastCgi “-[fullPath=‘C:\php\php-cgi.exe’].maxInstanceRequests:10000"
    appcmd set config -section:system.webServer/fastCgi /+”[fullPath=‘C:\php\php-cgi.exe’].environmentVariables.[name=‘PHP_FCGI_MAX_REQUESTS’, value=‘10000’]

Output from command will show:
Applied configuration changes to section “system.webServer/fastCgi” for “MACHINE/WEBROOT/APPHOST” at configuration commit path “MACHINE/WEBROOT/APPHOST”

  1. Execute: appcmd list config -section:system.webServer/fastCgi

Verify the settings are committed:

<system.webServer>







</system.webServer>

  1. Execute: iisreset

have you tried different batch sizes etc> machine is not idle but it is importing 500 page views at once in piwik

Yes, I have. The amount of time to process is no different and sometimes results in server 500 server unavailable. This is why I had started to tune IIS/PHP/FCGI. Even with those changes, it doesn’t seem to make any difference.