get_data function

Load a standard dataset, while allowing the user to override it with their own data.
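
The description above does not give the signature, so the following is only a minimal sketch of the load-with-override pattern it names, not the actual implementation. The `name` and `override` parameters, the `_STANDARD_DATASETS` registry, and the sample datasets are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry of built-in datasets (illustrative names and contents).
_STANDARD_DATASETS: Dict[str, Callable[[], List[int]]] = {
    "digits": lambda: list(range(10)),
    "evens": lambda: [0, 2, 4, 6, 8],
}


def get_data(name: str, override: Optional[List[int]] = None) -> List[int]:
    """Return the user's data if given, otherwise the standard dataset `name`."""
    if override is not None:
        # User-supplied data takes precedence over the built-in dataset.
        return override
    try:
        loader = _STANDARD_DATASETS[name]
    except KeyError:
        # Fail loudly on an unrecognized dataset name.
        raise ValueError(f"unknown dataset: {name!r}") from None
    return loader()


default_data = get_data("evens")                    # built-in dataset
custom_data = get_data("evens", override=[1, 3, 5])  # user override wins
```

In this sketch the override is checked with `is not None` rather than truthiness, so that an empty list passed by the user is still honored instead of silently falling back to the standard dataset.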