Swimming microorganisms and artificial micro-swimmers can exploit environmental stimuli to bias their motility patterns, either to achieve a biologically relevant goal or to perform an engineered task in a complex environment. We consider a model of ‘smart gravitactic swimmers’: active particles suspended in a complex flow whose task is to reach the highest possible altitude within a given time horizon, subject to the constraints imposed by fluid mechanics. Sensing only partial information about the flow regions it visits, a smart gravitactic swimmer must adapt its swimming behaviour so as to maximize its ascent. We use Reinforcement Learning as a framework to develop efficient swimming strategies that allow smart gravitactic swimmers to accomplish this long-term task.