From two minutes to less than half a second!
Data written to an unlogged table will not be logged to the write-ahead-log (WAL), making it ideal for intermediate tables. Note that <span style="color: #010101;">UNLOGGED</span>
tables will not be restored in case of a crash, and will not be replicated.
.. Copy Data From a String Iterator with Buffer Size▶
In an attempt to squeeze one final drop of performance, we notice that just like <span style="color: #010101;">page_size</span>
, the <span style="color: #010101;">copy</span>
command also accepts a similar argument called <span style="color: #010101;">size</span>
:
size – size of the buffer used to read from the file.
Let’s add a <span style="color: #010101;">size</span>
argument to the function:
@profile
def copy_string_iterator(connection, beers: Iterator[Dict[str, Any]], size: int = 8192) -> None:
with connection.cursor() as cursor:
create_staging_table(cursor)
beers_string_iterator = StringIteratorIO((
'|'.join(map(clean_csv_value, (
beer['id'],
beer['name'],
beer['tagline'],
parse_first_brewed(beer['first_brewed']).isoformat(),
beer['description'],
beer['image_url'],
beer['abv'],
beer['ibu'],
beer['target_fg'],
beer['target_og'],
beer['ebc'],
beer['srm'],
beer['ph'],
beer['attenuation_level'],
beer['brewers_tips'],
beer['contributed_by'],
beer['volume']['value'],
))) + '\n'
for beer in beers
))
cursor.copy_from(beers_string_iterator, 'beers', sep='|', size=size)
The default value for size is 8192, which is <span style="color: #010101;">2 ** 13</span>
, so we will keep sizes in powers of 2:
>>> copy_string_iterator(connection, iter(beers), size=1024)
copy_string_iterator(size=1024)
Time 0.4536
Memory 0.0
>>> copy_string_iterator(connection, iter(beers), size=8192)
copy_string_iterator(size=8192)
Time 0.4596
Memory 0.0
>>> copy_string_iterator(connection, iter(beers), size=16384)
copy_string_iterator(size=16384)
Time 0.4649
Memory 0.0
>>> copy_string_iterator(connection, iter(beers), size=65536)
copy_string_iterator(size=65536)
Time 0.6171
Memory 0.0