Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it possible to save in an external database #305

Open
wilmardo opened this issue Feb 3, 2021 · 16 comments
Open

Make it possible to save in an external database #305

wilmardo opened this issue Feb 3, 2021 · 16 comments
Labels
enhancement New feature or request

Comments

@wilmardo
Copy link

wilmardo commented Feb 3, 2021

Same issue as Koenkk/zigbee2mqtt#4947

As stated in the linked issue this would be a welcome addition for users who write on SD card like storage.

My usecase is a bit specific but nonetheless a bit of explanation.
I am trying to get a stateless Zigbee2MQTT implementation so it can failover between Kubernetes nodes (external CC2530 TCP serial) without the need of a volume (since distributed storage is hard on low power devices).
I am using Kubernetes configmaps for the configuration so only the database.db and the coordinator_backup.json are blocking a completely stateless approach.
If these could be stored in an external database then the image can be run without a volume (stateless) which offloads the persistence to a NAS for example.

@Koenkk Koenkk added the enhancement New feature or request label Feb 3, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2021

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the stale label Mar 6, 2021
@wilmardo
Copy link
Author

wilmardo commented Mar 8, 2021

Not stale.

With something like Sequelize it would be possible to maintain the current SQLite implementation as backwards compatible option.

@Koenkk Koenkk removed the stale label Mar 8, 2021
@github-actions
Copy link
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@wilmardo
Copy link
Author

Still not stale though haha

@Koenkk Koenkk removed the stale label Apr 22, 2021
@Koenkk Koenkk reopened this Apr 22, 2021
@timdonovanuk
Copy link

+1 to this. But you mention a NAS - you could always mount the k8s container storage on a NFS shared drive. Or GlusterFS.

@github-actions
Copy link
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days

@Toastyyy3
Copy link

I suppose there is still no support for external databases?

@judahrand
Copy link

@Koenkk I think this is still not stale? I'd be interested in having a little dig into this if there is might be any interest in accepting changes.

@Koenkk Koenkk reopened this Sep 25, 2024
@Koenkk
Copy link
Owner

Koenkk commented Sep 25, 2024

@judahrand changes are welcome, but note that it's not just the database.db and coordinator_backup.json, all files in the data folder should be persisted (also state.json)

@github-actions github-actions bot removed the stale label Sep 26, 2024
@judahrand
Copy link

Fantastic! I took an initial look at this last night and it (obviously) won't be a trivial change but I do think there could be significant value to using a 'real' database in terms of stability, extensibility (litestream, Postgres, etc.), and, I suspect, even performance.

I'm not a Typescript developer normally so I'm not familiar with the ORM landscape in Typescript but from my investigation last night I think that using Drizzle probably makes the most sense. It seems to aim to be as lightweight as possible seems to fit well for zigbee-herdsman. Initially, I suspect it would also be most straightforward to only support Sqlite. Other databases could be supported later if the desire was present.

There are a few design considerations to make too, I think. One that jumps to mind is how the 'backups' would work. For Coordinator backups this could probably just be versioned rows in a table and the newest row is always used by default or a given row could be marked as 'active'. A limited number of these backups could be kept at any given time and they could be timestamped.

When it comes to the backup logic when a reset is performed it becomes a little more difficult. For Sqlite initially the same logic can be used where we just copy the entire database file to a backed up version. However, this isn't really possible when using a remote database (and in this case one could argue that this sort of disaster recovery should be the database's responsibility not the application's). I wonder if in that case the functionality could be dropped?

I'm sure there are other decisions that would need to be made but from what I can tell, as long as the API for Controller is maintained and a migration path for the current files exists then other breaking changes should be acceptable. Is this correct?

@Koenkk
Copy link
Owner

Koenkk commented Sep 28, 2024

I'm not convinced that a SQL database will have any impact on performance, the whole database is loaded in memory and all devices are resolved through a very efficient lookup. A typical database is also nothing in terms of size, mine is 102KB (+- 80 devices).

An external database will also not solve the problem of needing persistent storage, z2m still stores configuration.yaml and state.json which needs to be writable.

I think a better approach here is to allow the Z2M data directory to be stored on a remote filesystem. If we use for example OpenDAL you could store the directory on almost any remote file system, e.g. via sftp or even OneDrive! 😄

@judahrand
Copy link

That's a fair point on performance. I hadn't initially realized that the whole database is loaded into memory. There are other advantages to using SQLite at least, however, imo.

An external database will also not solve the problem of needing persistent storage, z2m still stores configuration.yaml and state.json which needs to be writable.

Does the configuration.yaml need to be writable and persistent? That seems a surprising property for a config file which I'd expect to be entirely user controlled. That state.json could easily be migrated to live in the same SQLite database I think.

I've got a branch which implements a SQLite database which almost passes all the tests so I'll get that working and put it up at least for consideration.

@Koenkk
Copy link
Owner

Koenkk commented Sep 28, 2024

Does the configuration.yaml need to be writable and persistent?

Yes, e.g. the devices is updated whenever you pair a new device and for example when updating settings through the frontend.

@Nerivec
Copy link
Collaborator

Nerivec commented Sep 29, 2024

I think orm is much too complex for the 100 or so lines that Z2M needs. There is practically no use for the features since, like Koenkk said, everything is cached in memory on startup and read/parsed from there at runtime, not from the database. The tiny size allows this without creating RAM issues, and makes it a hard-to-beat option as far as speed is concerned.

As for writes, a quick bench on current vs drizzle orm branch shows a 2x drop in performance minimum. No doubt the overhead alone is a killer :/

Coordinator backups in the database is definitely a no-go for me. This is not portable.

@judahrand
Copy link

As for writes, a quick bench on current vs drizzle orm branch shows a 2x drop in performance minimum. No doubt the overhead alone is a killer :/

What benchmark are you running? I suspect it can be improved substantially - the current implementation is the simplest lift and shift possible.

Coordinator backups in the database is definitely a no-go for me. This is not portable.

I'm not sure I'm understanding you here. Why is it not portable? Easy to get back to the same JSON by pulling it out of a db table.

@Nerivec
Copy link
Collaborator

Nerivec commented Sep 29, 2024

What benchmark are you running? I suspect it can be improved substantially - the current implementation is the simplest lift and shift possible.

The usual, tinybench, with a save action (nothing else but a device.save(), I bypassed the rest of the logic).
I don't think it can be improved by much, since every runtime action is in-memory, not in-database, only saves actually touch the database at runtime, no querying, no filtering, etc.. The only thing the orm would actually end up being used for, is load everything in-memory on startup, then periodically write to fs (trying to do anything else versus the current in-RAM implementation would result in a further massive drop in performance). Hence the orm overhead being a killer against a plain fs read/write.

Easy to get back to the same JSON by pulling it out of a db table.

Not without dependencies and with extra parsing required. Current file is natively supported and follows the open backup format.
https://github.com/zigpy/open-coordinator-backup

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants