Merged PR 25889: Fixes bad memory access problem in hashing

Fix bad memory access problem in hashing by using the graph allocator
Marcin Junczys-Dowmunt 2022-09-29 19:01:49 +00:00
parent 2cd3055d76
commit 2c55cdb3c0
3 changed files with 4 additions and 2 deletions


@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 - `--output-sampling` now works with ensembles (requires proper normalization via e.g `--weights 0.5 0.5`)
 ### Fixed
+- Use allocator in hashing
 - Read/restore checkpoints from main process only when training with MPI
 - Multi-loss casts type to first loss-type before accumulation (aborted before due to missing cast)
 - Throw `ShapeSizeException` if total expanded shape size exceeds numeric capacity of the maximum int value (2^31-1)


@@ -1 +1 @@
-v1.11.11
+v1.11.12


@@ -99,7 +99,8 @@ void GraphGroup::syncParametersAndShards() {
   // compute hash value of parameters of 0-th graph (we only need to check one graph per node)
   for(int i = 0; i < hashes.size(); i++) {
     if(i == mpi_->myMPIRank()) {
-      hashes[i] = graphs_[0]->params()->vals()->hash(); // this is quite fast with on-GPU implementation
+      auto allocator = graphs_[0]->allocator();
+      hashes[i] = graphs_[0]->params()->vals()->hash(1234, allocator); // this is quite fast with on-GPU implementation
       LOG(debug, "Parameter hash for graph 0 on node {}: {}", mpi_->myMPIRank(), hashes[i]);
     }
   }
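
Reviewer note: the change above passes a seed and the graph's allocator into `hash()`, so any scratch memory the hash needs comes from an allocator the graph owns. The diff does not show how the hash uses that allocator internally, so the following is only a minimal CPU sketch of the general idea under stated assumptions. `ScratchAllocator`, `hashCombine`, `hashValues` and `allRanksAgree` are hypothetical names, not Marian classes or functions; the seed `1234` simply echoes the value in the diff, and the cross-rank comparison is an assumption about what the per-rank hashes are used for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a graph-owned allocator (not Marian's class):
// scratch memory comes out of one arena whose lifetime the owner controls.
class ScratchAllocator {
public:
  explicit ScratchAllocator(std::size_t words) : arena_(words), used_(0) {}

  // Hand out `words` 64-bit slots; throw rather than return memory we do not own.
  std::uint64_t* allocate(std::size_t words) {
    if(words > arena_.size() - used_)
      throw std::runtime_error("scratch arena exhausted");
    std::uint64_t* p = arena_.data() + used_;
    used_ += words;
    return p;
  }

  std::size_t mark() const { return used_; }
  void rewind(std::size_t mark) { used_ = mark; }

private:
  std::vector<std::uint64_t> arena_;
  std::size_t used_;
};

// Combine step in the style of boost::hash_combine, with a 64-bit constant.
inline std::uint64_t hashCombine(std::uint64_t seed, std::uint64_t v) {
  return seed ^ (v + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

// Block-wise hash: one partial hash per block is written to scratch memory
// taken from `alloc`, then the partials are folded into a single value.
std::uint64_t hashValues(const std::vector<float>& vals, std::uint64_t seed,
                         ScratchAllocator& alloc, std::size_t blockSize = 4) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  assert(blockSize > 0);
  const std::size_t numBlocks = (vals.size() + blockSize - 1) / blockSize;
  const std::size_t mark = alloc.mark();
  std::uint64_t* partial = alloc.allocate(numBlocks);

  for(std::size_t b = 0; b < numBlocks; ++b) {
    std::uint64_t h = seed;
    const std::size_t end = std::min(vals.size(), (b + 1) * blockSize);
    for(std::size_t i = b * blockSize; i < end; ++i) {
      std::uint32_t bits;
      std::memcpy(&bits, &vals[i], sizeof(bits)); // hash the raw bit pattern
      h = hashCombine(h, bits);
    }
    partial[b] = h;
  }

  std::uint64_t result = seed;
  for(std::size_t b = 0; b < numBlocks; ++b)
    result = hashCombine(result, partial[b]);

  alloc.rewind(mark); // give the scratch space back to the arena
  return result;
}

// Assumed consistency check: every rank should report the same parameter hash.
bool allRanksAgree(const std::vector<std::uint64_t>& hashes) {
  for(std::size_t i = 1; i < hashes.size(); ++i)
    if(hashes[i] != hashes[0])
      return false;
  return true;
}
```

In this sketch a caller would build one `ScratchAllocator` alongside the parameters, call `hashValues(vals, 1234, alloc)` on each rank, gather the results into a vector, and pass that vector to `allRanksAgree`. An arena that is too small raises an exception instead of handing out memory it does not own, which is the failure mode the sketch is meant to contrast with a bad memory access.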