We use model-free reinforcement learning (RL) to investigate how a mortgage servicer can optimize her actions toward a borrower. Our methodology differs from the conventional heuristic approach in that it does not rely on subjective, qualitative judgments from industry and legal experts. We are the first to exploit the borrower's post-securitization soft information and her responsiveness to the servicer to estimate an RL policy rule. While maximizing her reward, the servicer dynamically learns the borrower's type. In doing so, the servicer can preempt adversarial behavior by the borrower, thereby increasing the borrower's cooperation.
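To fix ideas, the following is a minimal sketch of the model-free mechanism described above: tabular Q-learning in which the servicer's state is a coarse belief about the borrower's unobserved type, updated from the borrower's responsiveness to each action. The state and action labels, the reward figures, and the transition simulator are illustrative assumptions for exposition only, not the paper's actual specification.

```python
import random
from collections import defaultdict

# Hypothetical servicer actions and coarse borrower-type beliefs (assumptions).
ACTIONS = ["no_contact", "outreach", "modify_loan"]
BELIEFS = ["likely_cooperative", "uncertain", "likely_adversarial"]

def simulate_borrower(belief, action):
    """Stylized environment: returns (reward, next_belief).

    A responsive borrower shifts the belief toward 'likely_cooperative';
    non-response shifts it the other way. Purely illustrative numbers."""
    p_respond = {"likely_cooperative": 0.8,
                 "uncertain": 0.5,
                 "likely_adversarial": 0.2}[belief]
    if action == "outreach":
        p_respond += 0.1  # outreach raises the chance of a response
    responded = random.random() < p_respond
    # Action costs plus a payoff for cooperation / penalty for non-response.
    reward = {"no_contact": 0.0, "outreach": -0.1, "modify_loan": -0.5}[action]
    reward += 1.0 if responded else -0.3
    idx = BELIEFS.index(belief)
    next_belief = BELIEFS[max(idx - 1, 0)] if responded else BELIEFS[min(idx + 1, 2)]
    return reward, next_belief

Q = defaultdict(float)
alpha, gamma, eps = 0.1, 0.95, 0.1  # learning rate, discount, exploration

for episode in range(5000):
    belief = "uncertain"
    for t in range(24):  # e.g., 24 monthly servicing decisions
        # Epsilon-greedy action selection.
        if random.random() < eps:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(belief, a)])
        reward, next_belief = simulate_borrower(belief, action)
        # Standard Q-learning update: model-free, no transition model required.
        best_next = max(Q[(next_belief, a)] for a in ACTIONS)
        Q[(belief, action)] += alpha * (reward + gamma * best_next - Q[(belief, action)])
        belief = next_belief

# The greedy action per belief state approximates the learned policy rule.
for b in BELIEFS:
    print(b, "->", max(ACTIONS, key=lambda a: Q[(b, a)]))
```

In this toy version, the belief state plays the role of the dynamically learned borrower type: as responsiveness accumulates, the policy conditions its action on the updated belief, which is the sense in which the servicer can preempt adversarial behavior.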