Personal blog: El hilo del laberinto
Last updated: Thursday, May 3, 2007
This code provides a new storage engine for DURUS, an excellent persistence system for the Python programming language.
The README included in the distribution:
$Id: README 332 2007-02-20 20:23:49Z jcea $
WHAT IS "DURUS-BERKELEYDBSTORAGE"?
"Durus-berkeleydbstorage" is a backend storage module for Durus, a persistence system for Python. As its name indicates,"Durus-berkeleydbstorage" uses Berkeley DB as the storage technology.
Some advantages compared to Durus standard FileStorage:
Any object stored in the storage is committed in a durable transaction, including all objects released along the way by the background garbage collector.
There are some disadvantages, nevertheless:
Failing to do that will leak disk space. It is possible that a future release will be able to collect cycles, but try to avoid that pattern.
Leaked objects will waste disk space, but **NO** corruption or malfunction will happen; there are no other side effects.
http://www.sleepycat.com/docs/ref/transapp/reclimit.html
http://www.sleepycat.com/docs/ref/transapp/archival.html
http://www.sleepycat.com/docs/utility/db_hotbackup.html
You can use this product either as a normal (local) file storage or as a server (remote) storage system, just like the usual Durus FileStorage.
HOW IS "DURUS-BERKELEYDBSTORAGE" USED?
IMPORTANT: The PATH you specify in the storage MUST BE an already existing directory. The database files will be created inside it.
You can use this engine in two ways:
The program is the only user of the storage, since local access is exclusive.
In your code simply put:
from durus.connection import Connection
from berkeleydb_storage import BerkeleyDBStorage

connection = Connection(BerkeleyDBStorage("PATH"))
where "PATH" is the path to already existant directory. The database will reside inside that directory.
After doing that, you use the connection like any other Durus connection. See "test3.py" as a reference implementation:
from durus.btree import BTree
from durus.connection import Connection
from berkeleydb_storage import BerkeleyDBStorage

connection = Connection(BerkeleyDBStorage("db"))
root = connection.get_root()
root[0] = BTree()
connection.commit()
for i in xrange(65536):
    root[0][i] = 0
connection.commit()
for i in xrange(0, 65536, 2):
    del root[0][i]
connection.commit()
print len(root[0])
Clients are "normal" Durus clients.
The Durus server must use this engine. The file "server.py" is a reference implementation. Example:
import sys
import socket
from optparse import OptionParser
from durus.storage_server import DEFAULT_PORT, DEFAULT_HOST, StorageServer
from durus.logger import log, logger, direct_output

def start():
    logfile = sys.stderr
    direct_output(logfile)
    from berkeleydb_storage import BerkeleyDBStorage
    storage = BerkeleyDBStorage("db")
    host = DEFAULT_HOST
    port = DEFAULT_PORT
    log(20, 'host=%s port=%s', host, port)
    StorageServer(storage, host=host, port=port).serve()

if __name__ == "__main__":
    start()
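The clients themselves never import this backend; they connect to the server using the stock Durus client storage. A minimal sketch, assuming the server above is running on the default host and port and that your Durus release provides "durus.client_storage.ClientStorage" with host/port keyword parameters:

from durus.connection import Connection
from durus.client_storage import ClientStorage
from durus.storage_server import DEFAULT_HOST, DEFAULT_PORT

# The Berkeley DB backend lives only in the server process; the client
# just speaks the normal Durus network protocol.
connection = Connection(ClientStorage(host=DEFAULT_HOST, port=DEFAULT_PORT))
root = connection.get_root()
print len(root)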
Additionally, you can specify options when you instantiate a Berkeley DB storage. The options are passed as optional parameters to the instance constructor (a combined example is shown after the parameter descriptions below):
The directory path where the storage resides. This directory MUST exist before trying to create a new storage.
This parameter is MANDATORY.
When creating a new storage database, we can choose between a "hash" and a "btree" schema.
Performance differences are noticeable only when the database size is in the multigigabyte/terabyte range, and they also depend on the access patterns of the application (for example, reference locality).
Remember also that client caching will considerably bias the access patterns that hit the disk, so your best bet is to test both configurations in your environment under your normal usage.
When the database already exists, this parameter has no effect.
Berkeley DB can use an mmapped cache file to improve performance. You don't usually need to touch this value.
You can see the hit ratio of the cache with the "db_stat" Berkeley DB tool.
This value is only used when creating the storage for the first time, or after a "db_recover". Otherwise it is ignored.
Berkeley DB uses a log buffer to store in-progress transaction data. If your buffer size is big, you waste RAM. If the size is too low, Berkeley DB will need to flush it frequently, affecting performance.
So you can tune this parameter according to your transaction (write) size. You can check whether the buffer is too small using the "db_stat" Berkeley DB tool.
This value is only used when creating the storage for the first time, or after a "db_recover". Otherwise it is ignored.
Using this parameter, you can request a "db_recover" operation explicitly. Remember, nevertheless, that this storage backend may decide to do an implicit "db_recover" if it thinks it is necessary.
"do_recover" constructor parameter default value changes according to "read_only" parameter. If the storage is opened read/write, "do_recover" default value will be True. If the storage is opened "read only", the "do_recover" default value will be False.
You can't do a database recovery if you opened it as "read only".
This storage backend complies with ACID semantics (Atomic, Consistent, Isolated and Durable). The Durable part incurs a performance penalty because it requires a synchronous disk write.
In some environments it may be desirable to trade durability for write performance. You risk, nevertheless, losing your last committed transactions; in some environments that may be acceptable.
You lose committed transactions if your machine crashes or is rebooted without warning, or if the storage application crashes in certain critical functions. You can also lose the latest committed transactions if you do a database "recover", or if one is done internally because the storage decides it is necessary.
In any case, you are guaranteed to keep ACI semantics: any transaction is applied entirely or not at all, there is no data corruption, etc. Also, no transaction X will be visible if transaction X-1 was "lost" because durability was disabled, so your data will be chronologically consistent.
If the storage instance has durable=True, this flag is ignored.
If durable=False, this flag allows:
Remember that this flag is only considered if you explicitly demanded a "non durable" storage. So you get what you asked for.
Open the storage database in "read only" mode. The storage must have been initialized previously, and should already contain some data.
In this mode, write attempts raise an exception.
In this mode, several applications can share the storage simultaneously.
This parameter defines the policy used to do database checkpointing.
The default value depends on the "read_only" flag:
If True: The default value is "berkeleydb_storage.checkpoint_noop". This policy does nothing. It is a "no operation".
This parameter defines the policy used to do garbage collection.
The default value depends on the "read_only" flag:
Opening the storage with this flag set can be a fairly slow operation, with time proportional to the number of objects in the storage. This option should be used only when upgrading this storage engine and the previous version had issues related to garbage collection.
Remember also that this option can take a lot of memory, proportional to the number of objects in the storage.
If set, this parameter instructs the storage to do, at init time, a reachability analysis to determine garbage consistency and possible leaks. We print the number of previously unknown unreachable objects.
Reference cycles are not detected, nor collected.
Using this parameter reduces the "to be collected" objects to a minimum garbage root set. From there, the garbage collector should be able to detect and collect all garbage not involved in reference cycles.
If the storage was opened in "read only" mode, no changes will actually be made to the storage.
NOTE: It is usual to find new garbage when initializing a storage several times with this parameter set. This is normal, and does not indicate any kind of problem; the usual garbage collection process would discover those unreachable objects automatically. In general, only use this parameter once, when upgrading this storage engine and the previous version had any kind of problem with garbage collection.
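To make the parameter descriptions above concrete, here is a hedged sketch of some constructor combinations. It only uses parameter names that appear elsewhere in this README ("do_recover", "read_only", "durable", "async_log_write", "validate_garbage"); treat it as an illustration rather than the definitive constructor signature, and open only one of these instances against a given directory at a time:

from berkeleydb_storage import BerkeleyDBStorage

# Read/write storage: request an explicit "db_recover" and a one-off
# garbage consistency check at init time.
storage = BerkeleyDBStorage("db", do_recover=True, validate_garbage=True)

# Trade durability for write speed: the last committed transactions may
# be lost on a crash, but ACI semantics are still guaranteed.
fast_storage = BerkeleyDBStorage("db", durable=False, async_log_write=True)

# Read-only access: several applications can share the storage, write
# attempts raise an exception, and "do_recover" defaults to False.
ro_storage = BerkeleyDBStorage("db", read_only=True)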
CHECKPOINT POLICY
Releases since 20061016 allow you to specify an object in the storage constructor to set the database checkpointing policy.
Programmers can create new arbitrary policy objects/factories using "berkeleydb_storage.checkpoint_interface" interface-like class as a blueprint. Any future interface change will be documented in the UPGRADING document.
Currently implemented policies are:
This is the default policy.
Since this policy does DB checkpointing while the storage is processing further transactions, we could temporarily have more database logging files in the environment than necessary. This is a transient issue that resolves itself automatically.
So, this policy slows down storage shutdown, but storage initialization will be faster if we have to do a database recover.
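As an illustration, the chosen policy object is simply handed to the constructor. The keyword name "checkpoint" below is an assumption of this sketch (the exact parameter name is not spelled out here); the policy objects themselves are the ones named in this README:

import berkeleydb_storage
from berkeleydb_storage import BerkeleyDBStorage

# NOTE: "checkpoint" is an assumed keyword name; check your release's
# constructor signature for the exact spelling.
storage = BerkeleyDBStorage(
    "db", checkpoint=berkeleydb_storage.checkpoint_thread_final_checkpoint)

# Old behaviour (no background thread): inline checkpointing.
# storage = BerkeleyDBStorage("db", checkpoint=berkeleydb_storage.checkpoint_inline)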
GARBAGE COLLECTION POLICY
Releases since 20070220 allow you to specify an object in the storage constructor to set the garbage collection policy.
Currently, the interface used as a blueprint is subject to change. DO NOT IMPLEMENT new policy objects; they can break without notice when upgrading this storage backend. When the API is stable enough to freely implement new policies, you will be notified.
Currently implemented policies are:
This is the default policy.
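Again as an illustration only: the garbage collection policy object is passed to the constructor in the same way. The keyword name "garbage_collection" is an assumption of this sketch; "garbage_collection_inline" and "garbage_collection_noop" are the policy objects named in this README:

import berkeleydb_storage
from berkeleydb_storage import BerkeleyDBStorage

# NOTE: "garbage_collection" is an assumed keyword name; check your
# release's constructor signature for the exact spelling.
storage = BerkeleyDBStorage(
    "db", garbage_collection=berkeleydb_storage.garbage_collection_inline)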
DATASET MIGRATION
If you already have data in a traditional Durus FileStorage, you can migrate it to this Storage Backend using the standard migration procedure: iterate over the objects in the source storage, send them to the destination storage and do a "commit" when you are done.
This procedure is usable if your dataset fits in RAM, but even if your machine is full of RAM or you have a huge swap space, on a 32-bit machine you will hit address space limitations, on the order of 2^30 bytes.
You can't do "bite-sized" commits because this backend does garbage collection in the background, and it could free stored but not yet referenced objects.
To solve these issues, releases since 20060509 have a "migrate()" method. It takes an iterable parameter yielding the original objects.
Example:
from durus.file_storage import FileStorage
from berkeleydb_storage import BerkeleyDBStorage

source = FileStorage("source", readonly=True)
destination = BerkeleyDBStorage("destination")
destination.migrate(source.gen_oid_record())
DOWNLOADS:
"durus-berkeleydbstorage" is released under GNU Public License, version 2.
This release is known to work with Berkeley DB releases 4.5.*, 4.4.* and 4.3.*.
Since this point, this Storage Engine requires Durus 3.7 and up.
This release is known to work with Berkeley DB releases 4.5.*, 4.4.* and 4.3.*.
The performance improvement is huge when deleting "cold" objects (objects not in the DB cache), especially if there are many inter-object references.
If you want or need the old behaviour (pure inline garbage collection), for example because your Python deployment doesn't support multithreading, you can activate it by passing "berkeleydb_storage.garbage_collection_inline" to the storage constructor.
"UPGRADING" and "README" documentation updates.
If a storage is opened in read-only mode, "pack" attempts are simply ignored.
Some regression bugfixes for "read only" mode combined with the current garbage collection policy objects.
The lock file shouldn't have +x mode set.
When doing a full and explicit "pack", with the checkpoint done in the background, we could end up with a lot of DB "log" files, since we could generate transactions faster than the cleanup thread can handle.
If using the prefetch garbage collection policy, a full and explicit "pack" burned CPU for no purpose.
Solved.
Patch to avoid (temporary) GC starvation if we get new garbage before finishing the GC prefetch.
Also, this patch avoids multiple prefetch of the same object when new garbage arrives.
Finally, this simple patch should solve an "assert()" failure in the prefetch thread. Hope so :-).
Update the "TODO" document.
First implementation of a garbage collection policy object doing garbage collection inline, but with background prefetch.
Solve a potential corruption if upgrading Storage backend from version 0 to version 2 directly.
Avoid raising spurious exceptions in the policy objects if we can't instantiate the storage correctly, for example because the storage is already locked.
Compatibility check-in in the checkpoint policy objects for the old Berkeley DB 4.3.
Better traceback if the checkpoint thread dies.
First implementation of the locking protocol in the garbage collection policy interface.
The first time you use this release to open a storage database created by previous releases, it will be transparently "upgraded" to the current format, so:
This checkpoint policy does a forced checkpoint when closing the storage. This slows down storage shutdown, but speeds up storage initialization.
First, tentative implementation of the "sync" feedback feature of Durus 3.6.
Since this point, this Storage Engine requires Durus 3.6 and up.
Full implementations of "garbage_collection_noop" and "garbage_collection_inline" policy objects.
A late compatibility fix for Durus 3.6.
This fix requires a (instantaneous) storage upgrade.
Initial support for garbage collection policy objects.
A new checkpoint policy object: "berkeleydb_storage.checkpoint_thread_final_checkpoint".
More graceful database closing if the program closes the storage handle and then dies without giving the garbage collector an opportunity to run.
The storage did a database recover even when asked not to do it.
Do some minor changes for compatibility with just released Durus 3.6.
"KNOW_HOW-DURUS" updated to Durus 3.6.
If you want or need the old behaviour (inline checkpointing), for example because your Python deployment doesn't support multithreading, you can activate it by passing "berkeleydb_storage.checkpoint_inline" to the storage constructor.
Of course, you can override these defaults if you wish and you know what you are doing.
Document "KNOW_HOW-DURUS" updated to Durus 3.5.
UPGRADING and README documentation updates.
If the checkpointing thread dies, notify the calling thread.
Be able to do the database checkpointing in the background.
"do_recover" constructor parameter default value changes according to "read_only" parameter. If the storage is opened read/write, "do_recover" default value will be True. If the storage is opened "read only", the "do_recover" default value will be False.
NOTE: Although documentation is not yet updated, this release works just fine under Durus 3.5 release.
If you are opening the storage "read_only", you must set this flag to False.
If you are using "non durable" storages, and you want to keep the previous behaviour, you MUST use "async_log_write=False". The new default will use asynchronous writes for the transactional logging.
"do_recover" constructor parameter default value changed from "False" to "True".
"KNOW_HOW-DURUS" updated to document a sort of persistent weak reference pattern.
"KNOW_HOW-DURUS" updated to document BTree abilities.
New "in use" flag inside the database storage. If a storage is opened and that flag is set, a database recovery is done.
That flag is cleared when the storage instance destructor is called.
This flag is not used if the database is opened in read only mode.
When the storage is opened read/write and non-durable, the storage instance destructor will try to (synchronously) flush the transaction log.
This last flush can't be guaranteed, nevertheless.
Add a new optional parameter to the constructor: "async_log_write".
Storage databases created with this release are not compatible with previous releases.
The first time you use this release to open a storage database created by previous releases, it will be transparently "upgraded" to the current format, so:
Also, in previous backend releases there was a bug in the garbage collection code that would skip over "to be deleted" objects, leaving some garbage behind.
So with this release you have three options:
It takes RAM proportional to object count in your storage, so beware if you have a huge database.
Of course you only need to pass this parameter once, to catch garbage leaked by previous releases.
Document the upgrade process.
"get_size()" is very costly. With current implementation, the storage must access all the database. I can implement a manual counter or migrate to btree and use DB_RECNUM:
http://www.sleepycat.com/docs/api_c/db_stat.html
http://www.sleepycat.com/docs/api_c/db_set_flags.html#DB_RECNUM
Finally, we implemented a manually managed counter. Now "get_size()" is instantaneous, but it requires a storage upgrade. So, you can't use previous backend versions anymore.
When creating a new storage database, be able to choose between "btree" and "hash" schema.
Document "KNOW_HOW-DURUS" updated to Durus 3.4.
We add a new "validate_garbage" optional parameter to the storage constructor. If that parameter is True, the storage will do a garbage check. Read the documentation in the README file for details.
If the storage was stopped before garbage collection was completed, the storage could leak some unreachable objects.
By solving this issue, we also make sure that garbage collection makes progress even if the storage is stopped repeatedly before garbage collection completes. That is, we store partial state to avoid trying to clean already collected objects.
Since some foreign code seems to depend on it, I implemented a "pack()" method in the storage.
Implements a "migrate()" method to migrate huge storages without hitting memory in a hard way. With this method you only need memory for an object, not the entire repository in RAM. So, this is imprescindible if your address space (in 32 bits) is small compared with your storage size.
File "KNOW_HOW-DURUS" contains a lot of info about Durus internals and advanced usage.
Solved an object leak when you commit a transaction having unreferenced objects. That is, you commit objects that are ALREADY garbage.
You usually don't do that, but it can be a very helpful pattern to break object cycles when deleting objects.
You could hit this bug if you keep references to persistent objects around between transactions, a big NO NO. You could "resurrect" a deleted object, and that object, and the object graph reachable from it, would become immortal.
"gen_oid_record()" implementation. This implementation has the same limitations that the standard Durus one. In particular, you shouldn't commit transactions while iterating over the storage, and this method can return already deleted objects, not yet collected. These "issues" are already present in the stock Durus implementation.
The usual use of this method is migrating class names of already stored objects.
"Read Only" mode implemented.
"D" in ACID is optional, if allowable, to improve performance.
Hint from Thomas Guettler.
There is a race condition between a client deleting an object and other client trying to fetch it. Previous versions of the code would kill the server with an assertion.
The current code will signal the issue to the caller. If you are using the server storage, the client will receive a "ReadConflictError" exception, just like in stock Durus, if a garbage collection happens between the object deletion in one client and the object retrieval in the other.
When instantiating a BerkeleyDBStorage, we run Berkeley DB "DB_RECOVER" if:
Improved checkpointing, especially when forcing a full collection. But remember that you don't need to do a full collection unless you REALLY need it; the backend automatically does garbage collection in the background. With this change we have a speed penalty of roughly 10% when doing a full collection.
Greatly improved removal of database logging files. Now you can expect a maximum of three logging files (30 MBytes) in the storage.
When creating a new Berkeley DB storage for the first time, we can configure the log buffer size and cache size using constructor parameters. If the storage is already created, those parameters are ignored.
When doing a commit, instead of loading the reference counts of all referenced objects, do the incremental adjustment without database access, all in RAM, and then access and modify ONLY the objects whose reference counters changed.
The code is simpler, faster and eats less RAM when you update heavily linked objects.
Cope with different combinations of bsddb/bsddb3 installations and missing installations.
First public release. Production ready.