We discuss a novel class of swarm-based gradient descent (SBGD) methods for non-convex optimization. The swarm consists of agents, each identified with a position, x, and a mass, m.
There are two key ingredients in the SBGD dynamics: (i) a persistent transfer of mass from agents at high ground to agents at lower ground; and (ii) a time-stepping protocol in which an agent's step size decreases with its mass m.
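One plausible discrete realization of these two ingredients is sketched below; the transfer rate $\eta$ and the step-size rule $h(\cdot)$ are illustrative assumptions, not necessarily the paper's exact protocols. With leader $j_* = \arg\min_k F(x_k^n)$, each agent cedes a fixed fraction of its mass to the leader,
$$
m_j^{n+1} = (1-\eta)\,m_j^n \quad (j \neq j_*), \qquad m_{j_*}^{n+1} = m_{j_*}^n + \eta \sum_{j \neq j_*} m_j^n,
$$
so the total mass is conserved, while each agent descends with a step that shrinks with its relative mass,
$$
x_j^{n+1} = x_j^n - h(\tilde m_j^n)\,\nabla F(x_j^n), \qquad \tilde m_j^n = \frac{m_j^n}{\max_k m_k^n},
$$
where $h(\cdot)$ is a decreasing function of $\tilde m$.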
The interplay between positions and masses leads to a dynamic distinction between "leaders" and "explorers": heavier agents lead the swarm near local minima with small time steps,
while lighter agents use larger time steps to explore the landscape in search of an improved global minimum, thereby reducing the overall "loss" of the swarm.
Convergence analysis and numerical simulations demonstrate the effectiveness of the SBGD method as a global optimizer.
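As a concrete illustration, here is a minimal NumPy sketch of the leader/explorer dynamics described above. It is a sketch under assumptions: the uniform transfer rate eta, the linear step-size rule, and the convention that all ceded mass flows to the single best agent are our illustrative choices, not necessarily the paper's exact protocols.

```python
import numpy as np

def sbgd(F, gradF, X0, steps=200, h_max=0.5, h_min=1e-3, eta=0.1):
    """Illustrative SBGD-style iteration (assumed protocols, not the
    paper's exact scheme).

    F, gradF : objective and its gradient
    X0       : (N, d) array of initial agent positions
    eta      : fraction of mass each agent cedes to the leader per step
    """
    X = np.asarray(X0, dtype=float)
    m = np.full(X.shape[0], 1.0 / X.shape[0])   # equal initial masses
    for _ in range(steps):
        leader = np.argmin([F(x) for x in X])   # agent on lowest ground
        # (i) mass transfer: agents on higher ground cede mass to the leader
        transfer = eta * m
        transfer[leader] = 0.0
        m -= transfer
        m[leader] += transfer.sum()
        # (ii) step size decreases with relative mass: heavy leaders take
        # small, careful steps; light explorers take large exploratory ones
        rel = m / m.max()
        h = h_min + (h_max - h_min) * (1.0 - rel)
        X -= h[:, None] * np.array([gradF(x) for x in X])
    best = np.argmin([F(x) for x in X])
    return X[best], F(X[best])

# Usage on a hypothetical tilted double well (global minimum near x = -1):
F = lambda x: (x[0]**2 - 1.0)**2 + 0.3 * x[0]
gradF = lambda x: np.array([4.0 * x[0] * (x[0]**2 - 1.0) + 0.3])
x_star, f_star = sbgd(F, gradF, np.random.default_rng(0).uniform(-2, 2, (20, 1)))
```

Note the design choice in step (ii): tying the step size to the *relative* mass m_j / max_k m_k keeps the leader near a local minimum with small steps while automatically granting the lightest agents the largest exploratory steps, which is what drives the leader/explorer division of labor.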