From 35b52c70534cb7193b218ec12efe6bc595312097 Mon Sep 17 00:00:00 2001 From: Tina Yang Date: Thu, 1 Apr 2010 14:09:00 -0700 Subject: [PATCH] RDS: Fix corrupted rds_mrs On second look at this bug (OFED #2002), it seems that the collision is not with the retransmission queue (packet acked by the peer), but with the local send completion. A theoretical sequence of events (from time t0 to t3) is thought to be as follows, Thread #1 t0: sock_release rds_release rds_send_drop_to /* wait on send completion */ t2: rds_rdma_drop_keys() /* destroy & free all mrs */ Thread #2 t1: rds_ib_send_cq_comp_handler rds_ib_send_unmap_rm rds_message_unmapped /* wake up #1 @ t0 */ t3: rds_message_put rds_message_purge rds_mr_put /* memory corruption detected */ The problem with the rds_rdma_drop_keys() is it could remove a mr's refcount more than its due (i.e. repeatedly as long as it still remains in the tree (mr->r_refcount > 0)). Theoretically it should remove only one reference - reference by the tree. /* Release any MRs associated with this socket */ while ((node = rb_first(&rs->rs_rdma_keys))) { mr = container_of(node, struct rds_mr, r_rb_node); if (mr->r_trans == rs->rs_transport) mr->r_invalidate = 0; rds_mr_put(mr); } I think the correct way of doing it is to remove the mr from the tree and rds_destroy_mr it first, then a rds_mr_put() to decrement its reference count by one. Whichever thread holds the last reference will free the mr via rds_mr_put(). Signed-off-by: Tina Yang Signed-off-by: Andy Grover --- net/rds/rdma.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/net/rds/rdma.c b/net/rds/rdma.c index 3b442d4d64cf..463b458ff27e 100644 --- a/net/rds/rdma.c +++ b/net/rds/rdma.c @@ -130,14 +130,22 @@ void rds_rdma_drop_keys(struct rds_sock *rs) { struct rds_mr *mr; struct rb_node *node; + unsigned long flags; /* Release any MRs associated with this socket */ + spin_lock_irqsave(&rs->rs_rdma_lock, flags); while ((node = rb_first(&rs->rs_rdma_keys))) { mr = container_of(node, struct rds_mr, r_rb_node); if (mr->r_trans == rs->rs_transport) mr->r_invalidate = 0; + rb_erase(&mr->r_rb_node, &rs->rs_rdma_keys); + RB_CLEAR_NODE(&mr->r_rb_node); + spin_unlock_irqrestore(&rs->rs_rdma_lock, flags); + rds_destroy_mr(mr); rds_mr_put(mr); + spin_lock_irqsave(&rs->rs_rdma_lock, flags); } + spin_unlock_irqrestore(&rs->rs_rdma_lock, flags); if (rs->rs_transport && rs->rs_transport->flush_mrs) rs->rs_transport->flush_mrs(); -- 2.39.5