All times are UTC + 2 hours




 Post subject: [Massive Import] Connection Refused / Timed Out
PostPosted: Tue Feb 24, 2009 1:00 pm 

Joined: Wed Dec 10, 2008 3:27 pm
Posts: 4
Hi,

I'm working on an OpenERP integration (V5), but I've run into trouble.

The ERP has to work with both a partner base and an entity base, which
are provided as CSV files by the company's other IT systems.

These CSV files have to be imported once a week to keep the data up to
date (so basically it's just creating/updating records), but the files
are quite big (some are around 130 MB and hold 300,000 entries).

I created wizards to give the users a simple way to run this import, but
during processing I get a "Timed out" error, because the socket timeout
expires before the whole file has been processed.

Is there a way to change the socket timeout during the wizard execution?

The alternative would be to split the CSV files into smaller parts, but if I
can skip that step, it would mean easier work for the users.

Thanks
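
For reference, the splitting alternative could look roughly like the sketch below. It assumes Python 3 and a header row in each file; `iter_csv_chunks` and the sample columns are hypothetical names for illustration, not part of OpenERP. Each chunk could then be passed to the import in its own, shorter request.

```python
import csv
import io
from itertools import islice


def iter_csv_chunks(lines, chunk_size):
    """Yield (header, rows) pairs with at most chunk_size data rows each."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    while True:
        rows = list(islice(reader, chunk_size))
        if not rows:
            break
        yield header, rows


# Tiny in-memory stand-in for one of the large CSV files (made-up data).
sample = io.StringIO("ref,name\n1,Alpha\n2,Beta\n3,Gamma\n4,Delta\n5,Epsilon\n")
chunks = list(iter_csv_chunks(sample, chunk_size=2))
chunk_sizes = [len(rows) for _, rows in chunks]
print(chunk_sizes)
```

Because the reader is consumed lazily, only one chunk is held in memory at a time, which matters for files of 130 MB.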


 Post subject:
PostPosted: Thu Mar 05, 2009 11:53 pm 

Joined: Tue Jun 10, 2008 8:58 am
Posts: 32
Location: The Netherlands / Bilthoven
An answer from an inexperienced newbie. Be aware!
With the last version I had a large import which repeatedly failed. I do not remember the exact error code. The cause was not OpenERP; it was the max_expr_depth setting of the PostgreSQL database. Raising it (>10000) did the trick for me.

Good luck,

Peter


 Post subject:
PostPosted: Fri Jun 12, 2009 10:32 am 

Joined: Thu Feb 19, 2009 2:33 pm
Posts: 9
Hi Spamart

I'm trying to load a large amount of data from CSV (30,000 records) and I also get a
"connection error"
"XML-RPC Error"
"('C', 'o', 'n', 'n', 'e', 'c', 't', 'i', 'o', 'n', ' ', 'r', 'e', 'f', 'u', 's', 'e', 'd', '!')"
but the records are still loaded steadily; you have to wait about 1 minute per 1000 records.
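
The third message looks like a plain string that was expanded character by character somewhere along the way, which is what Python does when a string is treated as a sequence. A small sketch (Python 3, not OpenERP code; where exactly this happens in the client is an assumption) shows the effect and how to read the message:

```python
# A string turned into a tuple becomes one element per character.
message = "Connection refused!"
as_tuple = tuple(message)

# Joining the characters gives the readable message back.
readable = "".join(as_tuple)
print(readable)
```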

PS : Sorry for my English

_________________
Salu2.

Fdo : LEO


http://www.ting.es


 Post subject: 1000 - 4000 records per second
PostPosted: Fri Dec 11, 2009 12:15 pm 

Joined: Fri Nov 06, 2009 4:26 am
Posts: 79
leovega wrote:
Hi Spamart

I'm trying to load a large amount of data from CSV (30,000 records) and I also get a
...
but the records are still loaded steadily; you have to wait about 1 minute per 1000 records.


I just finished bulk uploading more than 500,000 records; the speed is somewhere between 1000 and 4000 records per second.
While the script is running, I open a pgAdmin III connection to the server, look at the target database's properties, and refresh the row count every second. That is where the speed figure comes from.

My trick is:
1. Create a new regular (not OpenERP-specific) table and import the CSV into it.
2. Prepare multiple PostgreSQL and OpenERP connections.
3. Arrrgggh... to make it short, here is my dirty code:
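
The speed figure can be worked out from any two row-count readings. A rough sketch in Python 3, where `rows_per_second` and the sample numbers are made up for illustration (in practice the counts would come from something like `SELECT count(*)` on the target table):

```python
def rows_per_second(samples):
    """samples: list of (timestamp_in_seconds, row_count), oldest first."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("timestamps must increase")
    # Rows added between the first and last reading, per second.
    return (n1 - n0) / elapsed


# Hypothetical readings taken one second apart.
samples = [(0.0, 12000), (1.0, 14500), (2.0, 16000)]
rate = rows_per_second(samples)
print(rate)
```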



Code:
#!/usr/bin/python
# Python 2 script (uses xmlrpclib and psycopg2).
import os, sys
import psycopg2, xmlrpclib
from threading import Thread

dbname = sys.stdin.read().strip()  # target OpenERP database name, read from stdin
dbuser = 'bino'
erpuser = 'admin'
erppasswd = 'admin'
POOL_SIZE = 6  # number of PostgreSQL / OpenERP connection pairs

def myfunc(ctr_code, ctr_name, db_con, erp_con):
    # Create one country in OpenERP, then create all of its areas.
    ctr_data = {
        'code': ctr_code,
        'name': ctr_name
    }
    ctrid = erp_con.execute(dbname, dbuid, erppasswd, 'pbx.country', 'create', ctr_data)
    cur_a = db_con.cursor()
    cur_a.execute("SELECT code,name FROM garea WHERE ccode = %(xccode)s", dict(xccode=ctr_code))
    for area in cur_a.fetchall():
        area_code = area[0]
        area_name = area[1]
        if area_code is None:  # no area code
            area_name = ctr_name + " (All Area)"  # means it's the whole country
        area_data = {
            'ccode': ctrid,
            'code': area_code,
            'name': area_name
        }
        erp_con.execute(dbname, dbuid, erppasswd, 'pbx.area', 'create', area_data)
    cur_a.close()

# Tables for source (plain, non-OpenERP tables)
os.system("psql -d %s -U %s -f './prefx.sql'" % (dbname, dbuser))

# Get the OpenERP user id
sock = xmlrpclib.ServerProxy('http://localhost:8069/xmlrpc/common', allow_none=1)
dbuid = sock.login(dbname, erpuser, erppasswd)

con_dict = {'vdbname': dbname, 'vdbuser': dbuser}
dsn = "dbname=%(vdbname)s user=%(vdbuser)s" % con_dict
con_c = psycopg2.connect(dsn)
cur_c = con_c.cursor()
cur_c.execute("SELECT * FROM genctr")

# Prepare multiple connections: one PostgreSQL / OpenERP pair per slot
con_pool = []
erp_pool = []
for i in range(POOL_SIZE):
    con_pool.append(psycopg2.connect(dsn))
    erp_pool.append(xmlrpclib.ServerProxy('http://localhost:8069/xmlrpc/object', allow_none=1))
workers = [None] * POOL_SIZE  # last thread started on each slot

con_num = 0
for ctr in cur_c.fetchall():
    # Add one country to pbx_country
    ctr_code = ctr[0]
    ctr_name = ctr[1]
    print ctr_code + " | " + ctr_name
    # A connection pair must not be used by two threads at once,
    # so wait for the previous thread on this slot before reusing it.
    if workers[con_num] is not None:
        workers[con_num].join()
    t = Thread(target=myfunc, args=(ctr_code, ctr_name, con_pool[con_num], erp_pool[con_num]))
    t.start()
    workers[con_num] = t
    con_num = (con_num + 1) % POOL_SIZE
cur_c.close()

# Wait for the remaining threads, then close every connection
for t in workers:
    if t is not None:
        t.join()
for con in con_pool + [con_c]:
    con.close()
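
Another way to share a fixed set of connections among threads is to keep them in a queue, so each one is held by only one thread at a time. The following is only a sketch in Python 3 under stated assumptions: `run_with_pool`, `fake_create` and the sample countries are hypothetical, and plain strings stand in for the real PostgreSQL and XML-RPC connections so that it runs without a server.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 6


def run_with_pool(items, resources, work):
    """Run work(item, resource) for every item.

    Each resource is held by only one thread at a time.
    """
    free = queue.Queue()
    for res in resources:
        free.put(res)

    def task(item):
        res = free.get()  # blocks until a resource is free
        try:
            return work(item, res)
        finally:
            free.put(res)  # hand the resource back

    with ThreadPoolExecutor(max_workers=len(resources)) as pool:
        # map() returns results in the same order as the input items.
        return list(pool.map(task, items))


# Stand-ins: strings instead of real connections, made-up sample rows.
resources = ["conn-%d" % i for i in range(POOL_SIZE)]
countries = [("ID", "Indonesia"), ("NL", "Netherlands"), ("ES", "Spain")]


def fake_create(country, conn):
    # A real worker would call the OpenERP 'create' method here.
    code, name = country
    return "%s|%s" % (code, name)


results = run_with_pool(countries, resources, fake_create)
print(results)
```

The number of threads never exceeds the number of connections, so the import cannot start one thread per row.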



I run it with

Code:
bino@erp:~/mydoc/openerp/smdr$ echo pbx01 |./uldata.py


Note (1):
pbx01 ==> my OpenERP database name
uldata.py ==> the name of my script

Note (2): This script imports 2 CSV files into 2 OpenERP tables that have a one2many relationship.

Note (3): This script is not general purpose, so it's not safe to copy and paste this code...

Sincerely
-bino-

