html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,issue,performed_via_github_app
https://github.com/simonw/datasette/issues/1550#issuecomment-991761635,https://api.github.com/repos/simonw/datasette/issues/1550,991761635,IC_kwDOBm6k_c47HRTj,9599,2021-12-11T19:39:01Z,2021-12-11T19:39:01Z,OWNER,"I wonder if this could work for public instances too with some kind of queuing mechanism?
I really need to do some benchmarking to figure out what the right maximum number of SQLite connections is. I'm just guessing at the moment.
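A rough sketch of the kind of benchmark I have in mind (the database path and query are placeholders, and the thread pool size stands in for the connection limit):
```python
import sqlite3
import time
from concurrent import futures

SQL = ""select count(*) from ny_times_us_counties""  # placeholder query

def run_query(_):
    db = sqlite3.connect(""../datasette/covid.db"")
    try:
        return db.execute(SQL).fetchall()
    finally:
        db.close()

# Time 100 queries at each pool size and compare
for max_workers in (1, 2, 5, 10, 20, 50):
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        start = time.perf_counter()
        list(executor.map(run_query, range(100)))
        print(max_workers, round(time.perf_counter() - start, 2))
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1077628073,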
https://github.com/simonw/datasette/issues/1550#issuecomment-991805516,https://api.github.com/repos/simonw/datasette/issues/1550,991805516,IC_kwDOBm6k_c47HcBM,9599,2021-12-11T23:43:24Z,2021-12-11T23:43:24Z,OWNER,"I built a tiny Starlette app to experiment with this a bit:
```python
import asyncio
import janus
from starlette.applications import Starlette
from starlette.responses import JSONResponse, HTMLResponse, StreamingResponse
from starlette.routing import Route
import sqlite3
from concurrent import futures

# Note: run_in_executor(None, ...) below uses the loop's default
# executor, not this one
executor = futures.ThreadPoolExecutor(max_workers=10)


async def homepage(request):
    return HTMLResponse(
        """"""
        <title>SQL CSV Server</title>
        <h1>SQL CSV Server</h1>
        """"""
    )


def run_query_in_thread(sql, sync_q):
    # Runs in a worker thread: execute the SQL and feed results to the queue
    db = sqlite3.connect(""../datasette/covid.db"")
    cursor = db.cursor()
    cursor.arraysize = 100  # Default is 1 apparently?
    cursor.execute(sql)
    columns = [d[0] for d in cursor.description]
    sync_q.put([columns])
    # Now start putting batches of rows
    while True:
        rows = cursor.fetchmany()
        if rows:
            sync_q.put(rows)
        else:
            break
    # Let queue know we are finished
    sync_q.put(None)


async def csv_query(request):
    sql = request.query_params[""sql""]
    queue = janus.Queue()
    loop = asyncio.get_running_loop()

    async def csv_generator():
        # Kick off the query in a thread - the returned future is discarded
        loop.run_in_executor(None, run_query_in_thread, sql, queue.sync_q)
        while True:
            rows = await queue.async_q.get()
            if rows is not None:
                for row in rows:
                    yield "","".join(map(str, row)) + ""\n""
                queue.async_q.task_done()
            else:
                # Cleanup
                queue.close()
                await queue.wait_closed()
                break

    return StreamingResponse(csv_generator(), media_type='text/plain')


app = Starlette(
    debug=True,
    routes=[
        Route(""/"", homepage),
        Route(""/csv"", csv_query),
    ],
)
```
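To serve it, a sketch (assuming the file is saved as `csv_server.py` and uvicorn is installed):
```python
# Run with: uvicorn csv_server:app
# or programmatically:
import uvicorn

if __name__ == ""__main__"":
    uvicorn.run(""csv_server:app"", port=8000)
```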
But... if I run this in a terminal window:
```
/tmp % wget 'http://127.0.0.1:8000/csv?sql=select+*+from+ny_times_us_counties'
```
it takes about 20 seconds to run and returns a 50MB file - but while it is running no other requests can be served by that server - not even the homepage! So something is blocking the event loop.
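Here's a quick way to reproduce that from a second process (standard library only; assumes the server is still running on port 8000):
```python
import threading
import time
import urllib.request

def timed_get(url):
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
    print(url, round(time.perf_counter() - start, 2))

# Start the slow CSV download in the background...
slow = threading.Thread(target=timed_get, args=(
    ""http://127.0.0.1:8000/csv?sql=select+*+from+ny_times_us_counties"",
))
slow.start()
time.sleep(1)
# ...then time the homepage - if the event loop is blocked this hangs too
timed_get(""http://127.0.0.1:8000/"")
slow.join()
```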
Maybe I should be using `fut = loop.run_in_executor(None, run_query_in_thread, sql, queue.sync_q)` and then awaiting `fut` somewhere, like in the Janus documentation? I don't think that's needed though. Needs more work to figure out why this is blocking.
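For reference, that Janus-documented pattern would look something like this here (a sketch, untested):
```python
async def csv_query(request):
    sql = request.query_params[""sql""]
    queue = janus.Queue()
    loop = asyncio.get_running_loop()

    async def csv_generator():
        # Keep the future around so it can be awaited once the queue drains
        fut = loop.run_in_executor(None, run_query_in_thread, sql, queue.sync_q)
        while True:
            rows = await queue.async_q.get()
            if rows is None:
                break
            for row in rows:
                yield "","".join(map(str, row)) + ""\n""
            queue.async_q.task_done()
        # Awaiting the future surfaces any exception raised in the thread
        await fut
        queue.close()
        await queue.wait_closed()

    return StreamingResponse(csv_generator(), media_type='text/plain')
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",1077628073,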