emysql_conn can kill its caller #145

Open
emschwar opened this issue Sep 10, 2014 · 4 comments

@emschwar

First, let me say I'm relatively new to Erlang, and perhaps I'm Doing It Wrong™, in which case I'd appreciate pointers to appropriate enlightenment. That said:

I'm operating in an environment where there are apparently network outages between my emysql client (running inside an ejabberd module) and the MySQL server it connects to. During an outage, any process that tries to access emysql is killed with:

=SUPERVISOR REPORT==== 4-Sep-2014::22:00:09 ===
     Supervisor: {local,ejabberd_c2s_sup}
     Context:    child_terminated
     Reason:     {connection_down,
                     {and_conn_reset_failed,
                         {cannot_reopen_in_reset,
                             {failed_to_recv_packet_header,timeout}}}}
     Offender:   [{pid,<0.22426.155>},

This seems to come from emysql:execute calling emysql:monitor_work, which in turn calls emysql_conn:test_connection, which calls exit() when it cannot acquire a connection. Again, maybe I'm doing it wrong, but I'd expect emysql:execute() to either return something like {error, connection_timeout} in this context, or perhaps throw an exception of some kind.

@jlouis
Collaborator

jlouis commented Sep 11, 2014

This driver is not particularly well-behaved in many situations. This error is due to the client sending a PING command to the server on an established connection, but not getting a response within the default timeout range (5 seconds, I believe). It doesn't even get a header here!

The problem is that emysql:execute should have a timeout that applies to the statement being executed. But there is also a timeout on the socket for communication, which is a bad idea.

In other words, this driver is bad, and we know it. But most other drivers are worse off.

Chances are, however, that something is amiss in your setup. It seems odd that one can create a connection and then not get a ping response within 5 seconds, the default timeout. So I would definitely check connectivity at a lower level first.

@emschwar
Author

Oh, I definitely have connectivity problems at a low level! That is, sadly, not in question. I'm working with my sysadmins to figure out why, but I'm also hoping to discover a way that, until I can solve the underlying connectivity issue, I can at least prevent the code from unexpectedly exit()ing ejabberd's c2s client.

I apologize if it seemed I was criticizing; it's more a case of me trying to ensure that my expectations are in fact valid, and not the product of my inexperience. Is there a pattern or technique that I can use here to ensure that at the very least, my calling process won't exit() in this case? Things are complicated somewhat in that I'm running in the context of somebody else's process (an ejabberd c2s client), so it's not so easy to just process_flag(trap_exit, true), since those messages could go to the c2s process instead of my code.
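One pattern that avoids touching trap_exit in the host process is to run the call in a short-lived process that is monitored but not linked, so a crash arrives as a 'DOWN' message instead of taking the caller down. This is only a sketch: the module name `isolated_call`, the function `run/2`, and the `worker_down` tag are made up for illustration, and I have not checked how emysql's pool behaves when it is called from a throwaway process.

```erlang
-module(isolated_call).
-export([run/2]).

%% Run Fun in a monitored, unlinked worker process and wait up to
%% TimeoutMs milliseconds for its result. An exit in the worker is
%% reported as {error, {worker_down, Reason}}; the caller keeps running.
run(Fun, TimeoutMs) when is_function(Fun, 0) ->
    Parent = self(),
    Tag = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() -> Parent ! {Tag, Fun()} end),
    receive
        {Tag, Result} ->
            %% Worker finished; drop the monitor and any pending 'DOWN'.
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            {error, {worker_down, Reason}}
    after TimeoutMs ->
        %% Kill the worker, wait for its 'DOWN', then discard a result
        %% that may have been sent just before it died, so no stray
        %% message is left in the host process's mailbox.
        exit(Pid, kill),
        receive {'DOWN', MRef, process, Pid, _} -> ok end,
        receive {Tag, _} -> ok after 0 -> ok end,
        {error, timeout}
    end.
```

A call might then look like `isolated_call:run(fun() -> emysql:execute(my_pool, <<"SELECT 1">>) end, 10000)`, where `my_pool` is a placeholder pool name. The cost is an extra process and a message copy per query.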

@emschwar
Author

FWIW, further investigation reveals that what happens is that there is an already-existing connection that was created just fine, but at some point the network connection between MySQL and emysql disappears. At that point, we

  1. check out the connection
  2. try to ping it
  3. fail, and try a new connection
  4. this fails immediately, and
  5. emysql exit()s

I think I can work around this by wrapping all my emysql:execute() calls in

case catch emysql:execute(...) of
  {'EXIT', Reason} -> retry_this(...)
  Result -> Result
end

but, and please correct me if my impression is wrong, this doesn't seem like the best solution.
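A slightly stricter variant of the same idea uses try ... catch restricted to the exit class, which leaves throws and runtime errors alone, and caps the number of retries so a long outage cannot loop forever. This is a sketch under assumptions: the module name `safe_db`, the functions `safe_call/1` and `with_retry/2`, and the `caught_exit` tag are invented for illustration and are not part of emysql.

```erlang
-module(safe_db).
-export([safe_call/1, with_retry/2]).

%% Run Fun in the calling process. If it calls exit(Reason), return
%% {error, {caught_exit, Reason}} instead of letting the caller die.
%% Any normal return value is passed through unchanged.
safe_call(Fun) when is_function(Fun, 0) ->
    try
        Fun()
    catch
        exit:Reason -> {error, {caught_exit, Reason}}
    end.

%% Try Fun up to Attempts times. Only a caught exit triggers a retry;
%% the last caught exit is returned if every attempt fails.
with_retry(Fun, Attempts) when is_integer(Attempts), Attempts >= 1 ->
    case safe_call(Fun) of
        {error, {caught_exit, _}} when Attempts > 1 ->
            with_retry(Fun, Attempts - 1);
        Other ->
            Other
    end.
```

Usage would be along the lines of `safe_db:with_retry(fun() -> emysql:execute(my_pool, <<"SELECT 1">>) end, 3)`, with `my_pool` standing in for a real pool name. Retrying immediately may not help while the network is still down, so a short delay between attempts could be worth adding.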

@jlouis
Collaborator

jlouis commented Sep 11, 2014

Well, the problem is that the interface between a user of the emysql library and the worker pool of emysql is not clear. So you end up with all these nasty consequences when you consider what is allowed and what is not.

The problem on the emysql maintainer end is that while I can solve some of these problems, the lack of a clear specification makes it really hard to actually resolve any problem. Building a testable spec with QuickCheck is definitely something that needs to be done, but since I hate MySQL with a passion, I am not really that interested in doing it, unless people throw money at me for doing so :)
