on power failure mongodb is corrupt #879

Open
tim-moody opened this issue Feb 3, 2017 · 8 comments

Comments

@tim-moody
Contributor

tim-moody commented Feb 3, 2017

On an RPi3, after losing power, the MongoDB service fails to start, both from the console and with runansible:

TASK [mongodb : enable services] ***********************************************
failed: [127.0.0.1] (item={u'name': u'mongodb'}) => {"failed": true, "item": {"name": "mongodb"}, "msg": "Unable to restart service mongodb: Job for mongodb.service failed. See 'systemctl status mongodb.service' and 'journalctl -xn' for details.\n"}

@tim-moody
Contributor Author

After running mongod --dbpath /library/dbdata/mongodb --repair the problem remained, though perhaps because I ran it as root.

rm -rf /library/dbdata/mongodb allowed it to run, but of course all data would have been lost.
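
One thing that may be worth ruling out before deleting the data directory: a repair run as root can leave files owned by root, which the service account may then be unable to open. The sketch below is only illustrative. It lists paths under the data directory that are not owned by a given uid and builds a repair command that runs as the service user. The account name "mongodb" is an assumption (it is what Debian-style packages usually create), and the helper names `files_not_owned_by` and `repair_command` are hypothetical.

```python
from pathlib import Path


def files_not_owned_by(dbpath, uid):
    """Return every path under dbpath (and dbpath itself) whose owner is not uid."""
    root = Path(dbpath)
    candidates = [root] + sorted(root.rglob("*"))
    # lstat() so that symlinks are checked themselves rather than their targets
    return [p for p in candidates if p.lstat().st_uid != uid]


def repair_command(dbpath, service_user="mongodb"):
    """Build a repair invocation that runs as the service account, not as root.

    Assumption: the account is called "mongodb"; adjust if the install differs.
    """
    return ["sudo", "-u", service_user, "mongod", "--dbpath", str(dbpath), "--repair"]
```

On the affected box this could be called with /library/dbdata/mongodb and the uid from `pwd.getpwnam("mongodb").pw_uid`. Any paths it reports would need their ownership restored before the service can use them. Whether that is actually the cause here is unconfirmed.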

@holta

holta commented Apr 14, 2017

Update from the Apr 13 call @ http://tinyurl.com/iiabminutes: @floydianslips is facing the exact same issue on Raspbian Lite today, just as I faced it on Raspbian Pixel earlier.

@floydianslips had not run http://box/sugarizer and likely did 1 "hard reset" by accident
@holta observed the system had frozen solid twice during the prior week (but it's behaving well for days on end now...)

"[What] is destroying MongoDB regularly? e.g. on Holt’s RPi3-Raspbian-128GB-install-test (runansible fails, after selecting “Check to Enable WordPress”). Despite being powered off properly (machine froze 2 times, in the 10 days since built on a brand new SD). Clarif: nothing to do with WordPress #902, but corrupt MongoDB prevented ansible from completing
i. https://forums.meteor.com/t/why-mongodb-is-unreliable/5370
ii. Workaround so far: “rm -rf /library/dbdata/mongodb” presumably blows away Sugarizer history :( [but at least ./runansible works after this!]
iii. @georgejhunt: turn on journaling in future to aid MongoDB repair?"

@holta

holta commented Apr 15, 2017

Unscientific Speculation (below) if @llaske has a moment to examine bug reports above:

"Something doesn't quite add up, as MongoDB wasn't jamming up Internet-in-a-Box/XSCE's ansible runs earlier this year in 2017, despite inevitable/intermittent power failures. And yet now it's happening frequently. Is there any possibility Sugarizer 0.8's MongoDB is more fragile than Sugarizer 0.7's, or that a new version of MongoDB is much more frail?"

@llaske

llaske commented Apr 15, 2017

There was no update on the MongoDB part between Sugarizer 0.7 and Sugarizer 0.8. So I don't think that the update could be the cause of the issue.
I'm not an expert on MongoDB, but maybe stopping it abruptly could cause it to fail on the next restart.
I guess a workaround could be to launch a preventive repair command at each server start?
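
A rough sketch of what such a pre-start check could look like, under stated assumptions: with the older storage engines, mongod normally leaves mongod.lock empty after a clean shutdown, so a non-empty lock file found before the service starts suggests the previous run did not exit cleanly. The function names `looks_unclean` and `prestart_repair_command` are hypothetical, and the check only makes sense while mongod is stopped, since the lock file is also non-empty while the server is running.

```python
import os


def looks_unclean(dbpath):
    """True if dbpath holds a non-empty mongod.lock (a hint of an unclean shutdown).

    Assumption: mongod is NOT running when this is called.
    """
    lock = os.path.join(dbpath, "mongod.lock")
    return os.path.isfile(lock) and os.path.getsize(lock) > 0


def prestart_repair_command(dbpath):
    """Return a repair command only when the last shutdown looks unclean, else None."""
    if looks_unclean(dbpath):
        return ["mongod", "--dbpath", dbpath, "--repair"]
    return None
```

A start-up script could run the returned command (as the service user) before starting the service, and skip the repair when `None` comes back. This only decides when to attempt a repair; it does not make the repair itself any more likely to succeed.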

@holta

holta commented Apr 15, 2017

Very sadly a MongoDB repair does not work, as @tim-moody and I have both tried that.

Thx @llaske for checking in~ we'll have to write this up as a Known Issue regrettably, plz plz help us monitor the situation in 2017, in case progress appears later!

@llaske

llaske commented Apr 15, 2017

:-(
Another workaround is to not start MongoDB and not start the sugarizer.js Node.js script.
Without its backend, Sugarizer will work in a "limited/degraded mode".
In this mode, activities can be launched and will run as usual, but neither presence (multiple users playing activities) nor collaboration (shared journal) will be available.
It's probably acceptable for deployments where using Sugarizer is not the main objective.
I can't guarantee, however, that this limited mode will continue to work in future versions.

@holta

holta commented Apr 15, 2017

Excellent advice @llaske !

@tim-moody has done most all the research here, but I'll ask @georgejhunt & @floydianslips to look into this too (as offline world / developing world stability is critical, when disconnected for ~2 years at a time...)

@tim-moody
Contributor Author

tim-moody commented Apr 15, 2017 via email
