Support for BINARY and VARBINARY in Database Schema

Created on 11 February 2010, over 14 years ago
Updated 13 February 2023, over 1 year ago

Problem/Motivation

Schema is missing generic BINARY and VARBINARY support (especially mysql); this prevents leveraging those DB-level facilities for cases where efficiency (query lookup time, storage usage) is highly desired.

Hex hashes are widely used, for many kinds of situations; they are just string representations of bytes; storing a hash as a VARCHAR (even if using the binary option) does not provide the same level of speed or reduced storage usage as just leveraging DB-level binary data structures. For a MySQL-specific analysis, see this blog post.

Actual measurements by @mfb:

... storing e.g. a SHA-256 hash in a UTF-8 CHAR(64) column uses six times more bytes than a BINARY(32) column.

... storing a hash in an ASCII VARCHAR(64) column is a big improvement, but still uses more than twice as much space as a BINARY(32) column

Here are some real-world numbers, for 4000 rows of SHA-256 hashes as primary key plus an additional index*.

CREATE TABLE `binhash` ( `hash` binary(32) NOT NULL, PRIMARY KEY (`hash`), KEY `hashkey` (`hash`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

binary(32) size on disk: 640K

CREATE TABLE `aschash` ( `hash` varchar(64) CHARACTER SET ascii NOT NULL, PRIMARY KEY (`hash`), KEY `hashkey` (`hash`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

varchar(64) ascii size on disk: 9.0M

Steps to reproduce

N/A, this is for new functionality.

Proposed resolution

  1. Add schema support for BINARY and VARBINARY columns.
  2. Augment database abstraction layer so that binary data can be easily used for all types of queries, for all supported DB engines.

Remaining tasks

  1. (see #50, #52).
  2. Finish test coverage for schema support.
  3. TBD: add support for proper conversion to/from binary for all types of queries. See #22

User interface changes

None.

API changes

  • .
  • Proper helpers (escaping, handling of default value, etc).

Data model changes

None

Release notes snippet

TBD

Feature request
Status

Needs work

Version

10.1

Component
Database 

Last updated less than a minute ago

  • Maintained by
  • 🇳🇱Netherlands @daffie
Created by

🇩🇪Germany ebruts

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

Production build 0.71.5 2024