issue_comments: 905021010

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	issue	performed_via_github_app
https://github.com/simonw/sqlite-utils/issues/319#issuecomment-905021010	https://api.github.com/repos/simonw/sqlite-utils/issues/319	905021010	IC_kwDOCGYnMM418YZS	66709385	2021-08-24T22:33:42Z	2021-08-24T22:33:42Z	NONE	Oh, I misread. Yes, some files will not be valid UTF-8. I'd throw a warning and continue (not adding that file), but if you want to get more elaborate you could let the user define a policy for what to do: skip the file, index the binary content, or use a conversion policy like the ones available in Python's decode. From https://stackoverflow.com/questions/24616678/unicodedecodeerror-in-python-when-reading-a-file-how-to-ignore-the-error-and-ju : - 'ignore' ignores errors. Note that ignoring encoding errors can lead to data loss. - 'replace' causes a replacement marker (such as '?') to be inserted where there is malformed data. - 'surrogateescape' will represent any incorrect bytes as low surrogate code units ranging from U+DC80 to U+DCFF. These surrogate code units will then be turned back into the same bytes when the surrogateescape error handler is used when writing data. This is useful for processing files in an unknown encoding. - 'xmlcharrefreplace' is only supported when writing to a file. Characters not supported by the encoding are replaced with the appropriate XML character reference &#nnn;. - 'backslashreplace' (also only supported when writing) replaces unsupported characters with Python’s backslashed escape sequences.	{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}	976399638
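
The policy idea in the comment can be sketched in a few lines of Python. This is only an illustration, not anything from sqlite-utils: the `decode_with_policy` helper, the `"skip"` policy name, and the sample bytes are all hypothetical. Every other policy is passed straight through as the `errors` argument of `bytes.decode`. Note that in Python 3.5 and later `'backslashreplace'` also works when decoding, while `'xmlcharrefreplace'` remains encode-only, so it is left out here.

```python
import warnings

# Sample input: Latin-1 encoded bytes, which are not valid UTF-8.
RAW = b"caf\xe9 ok"


def decode_with_policy(data, policy="skip"):
    """Decode bytes as UTF-8 using a caller-chosen policy (hypothetical helper).

    "skip" warns and returns None so the caller can leave the file out.
    Any other value is handed to bytes.decode() as the `errors` argument,
    e.g. "ignore", "replace", "surrogateescape" or "backslashreplace".
    """
    if policy == "skip":
        try:
            return data.decode("utf-8")
        except UnicodeDecodeError as exc:
            warnings.warn(f"not valid UTF-8, skipping: {exc}")
            return None
    return data.decode("utf-8", errors=policy)


if __name__ == "__main__":
    for policy in ("ignore", "replace", "surrogateescape", "backslashreplace"):
        # ascii() keeps the output printable even for lone surrogates.
        print(policy, ascii(decode_with_policy(RAW, policy)))
```

Text decoded with `'surrogateescape'` can be turned back into the original bytes with `text.encode("utf-8", errors="surrogateescape")`, which is what makes that policy lossless.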