Channel: Mockaroo Community Forum - Latest posts

Lookup using from_dataset() regularly times out


I have a data set of 5 columns and about 87,000 rows, and I am trying to look up codes in it using the Diagnosis Code column. With only one lookup column in my schema, the output is often correct, but sometimes I get timeout errors. With two lookup columns, it always times out:

ClaimId,LineNumber,Icd10ProcedureCode,HcpcsCptCode
1000003,1,Error: Timed out,Error: Timed out
1000003,2,Error: Timed out,Error: Timed out
1001002,1,Error: Timed out,Error: Timed out
1001002,2,Error: Timed out,Error: Timed out
1002002,1,Error: Timed out,Error: Timed out
...

Sometimes both columns time out, and other times only one does.
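
In case it helps anyone measure how often this happens, here is a rough Python sketch that counts the timed-out cells per column in a generated CSV. The function name `count_timeouts` is my own, and the only thing it relies on is the literal text "Error: Timed out" shown in the output above. It is not anything Mockaroo provides.

```python
import csv
import io
from collections import Counter

# Literal text that appears in the failed cells of the output above.
TIMEOUT_MARKER = "Error: Timed out"


def count_timeouts(csv_text):
    """Return (row_count, Counter of column name -> timed-out cell count)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    timeouts = Counter()
    row_count = 0
    for row in reader:
        row_count += 1
        for column, value in row.items():
            if value == TIMEOUT_MARKER:
                timeouts[column] += 1
    return row_count, timeouts


# The first three rows of the output pasted above.
sample = """ClaimId,LineNumber,Icd10ProcedureCode,HcpcsCptCode
1000003,1,Error: Timed out,Error: Timed out
1000003,2,Error: Timed out,Error: Timed out
1001002,1,Error: Timed out,Error: Timed out
"""

row_count, timeouts = count_timeouts(sample)
for column, failed in timeouts.items():
    print(f"{column}: {failed} of {row_count} rows timed out")
```

Running it against a full download (read the file into a string first) shows whether one lookup column fails more often than the other.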

Here is the schema with two similar lookups using from_dataset():
Test Claim Line

Here is a simpler schema with only one lookup column, which also times out, though less often:
Test Procedure Code Lookup

UPDATE:

I have reduced the lookup data set to 10,000 rows, and the problem has subsided. I hope this post helps others avoid this problem. Of course, I also had to regenerate and re-upload all the data sets that I was performing the lookup for, as they had been generated from the same lookup data set. I wonder what the true upper limit is for lookup table size.
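
For anyone who wants to do the same trimming, a minimal Python sketch follows. It assumes the lookup data is a CSV file with a header row and that a random sample is acceptable. The function names and the file names in the usage note are hypothetical, and 10,000 is simply the size that worked for me, not a documented limit.

```python
import csv
import random


def downsample_rows(rows, limit=10_000, seed=0):
    """Keep the header row plus at most `limit` randomly chosen data rows."""
    if not rows:
        return rows
    header, data = rows[0], rows[1:]
    if len(data) <= limit:
        return rows
    rng = random.Random(seed)  # fixed seed so the same subset is chosen each run
    return [header] + rng.sample(data, limit)


def downsample_csv(src_path, dst_path, limit=10_000, seed=0):
    """Read a CSV file, trim it with downsample_rows, and write the result."""
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(downsample_rows(rows, limit, seed))
```

Usage would be something like `downsample_csv("codes_full.csv", "codes_10k.csv")`, followed by uploading the smaller file as the lookup data set. Because a random sample drops codes, any data set that references codes from the full file has to be regenerated against the smaller one, which is the extra work mentioned above.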

