# Solved – Poisson regression with offset vs logistic regression

I am thinking of using Poisson regression with an offset variable instead of logistic regression when the event is rare, since p (the probability of success) is very small and n (the sample size) is large. What are the differences between these two approaches? Where can I find some literature about it?

I might be in for a real learning treat here, but it seems to me that you're trying to model one problem using two very different distributions.

A Poisson-distributed outcome is a non-negative integer with no upper bound. Logistic regression is intended for binary outcomes, i.e. binomial data. The two kinds of output look the same at a quick glance, but you have to consider whether you can reasonably define how many trials you're conducting and assign a probability of success to every trial. If you can, you have a binomial distribution.
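To make the contrast concrete, here is a sketch of the two model forms in the usual GLM notation. The symbols $y_i$, $n_i$, $p_i$, $\mu_i$, $x_i$ and $\beta$ are my own choice for illustration, not something taken from the question:

$$
\text{Logistic (binomial):}\qquad y_i \sim \mathrm{Binomial}(n_i,\, p_i), \qquad \log\frac{p_i}{1-p_i} = x_i^\top \beta
$$

$$
\text{Poisson with offset:}\qquad y_i \sim \mathrm{Poisson}(\mu_i), \qquad \log \mu_i = \log n_i + x_i^\top \beta
$$

In the second form, $\log n_i$ is the offset: a term whose coefficient is fixed at 1, so that $\exp(x_i^\top \beta)$ describes the rate $\mu_i / n_i$. When $p_i$ is small, $\log\frac{p_i}{1-p_i} \approx \log p_i$, which is one way to see why the two models can give similar coefficients for rare events even though they assume different distributions.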

Consider two examples: 1) Model the survival probability of passengers on the Titanic: binomial. You know the number of passengers in every class, i.e. the number of distinct trials, and you know how many survived.

2) Model the number of ear infections per year among different kinds of swimmers: Poisson with offset. You DO know the number of swimmers in every group, and the log of that number is the offset in the Poisson regression, but you can't reasonably ask how many times you've tested whether a swimmer caught an ear infection or not. You can only count the infections once your chosen time interval is up.

It seems to me that you should clarify what kind of output you're looking at and, after reasoning about what you could expect from that output, decide on the correct family of distributions to model it with.
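As a rough numerical illustration of what the choice of family changes, here is a minimal Python sketch using made-up counts (the numbers and function names are hypothetical, chosen only for this example). It assumes the simplest possible design: two groups and a single binary covariate. In that case both models are saturated, so as far as I can tell the fitted slope of a Poisson regression with offset `log(n)` is just the log rate ratio, and the fitted slope of a logistic regression is the log odds ratio, and no fitting library is needed.

```python
import math

def log_rate_ratio(y1, n1, y0, n0):
    """Slope a Poisson GLM with offset log(n) would give for a two-group indicator."""
    return math.log((y1 / n1) / (y0 / n0))

def log_odds_ratio(y1, n1, y0, n0):
    """Slope a logistic (binomial) GLM would give for the same two-group indicator."""
    return math.log((y1 / (n1 - y1)) / (y0 / (n0 - y0)))

# Hypothetical data: y events among n units, group 1 vs group 0.
rare = dict(y1=30, n1=10_000, y0=12, n0=10_000)          # event probability well below 1%
common = dict(y1=3_000, n1=10_000, y0=1_200, n0=10_000)  # event probability 12% to 30%

for label, d in (("rare", rare), ("common", common)):
    rr = math.exp(log_rate_ratio(**d))
    odds_r = math.exp(log_odds_ratio(**d))
    print(f"{label:>6}: rate ratio = {rr:.3f}, odds ratio = {odds_r:.3f}")
```

With these made-up counts the rate ratio is 2.5 in both scenarios, while the odds ratio is about 2.505 in the rare case and about 3.143 in the common case. So for rare events the two approaches give nearly the same effect estimate, but they answer slightly different questions (a ratio of rates versus a ratio of odds), and the gap grows as the event becomes more common.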

If this does not point you in the right direction then I'm very eager to learn some new statistics tricks.

Edit: As for literature, I would recommend anything related to generalized linear models (not to be confused with general linear models).
