As previously noted, the web can be regarded as an endless source of open data. Some of it is structured and documented, but most of it is unregulated and uploaded to various platforms with little thought to how it might be usefully reused by the global research community. Finding open data is easy; finding open data that suits your research needs, or that can be used to reproduce and verify the findings of a research publication, will probably require additional input on your part.
Given the daunting amount of data resources available, the best place to start looking for suitable data is among your peers, your subject area community and social media hubs. In many cases, a simple Google search will identify many of the open data resources you are looking for. Reliable sources of data will always be reported and shared; trustworthy data will usually be hosted by a reputable institution, have sufficient documentation to understand the data, have a licence attached and provide a Digital Object Identifier (DOI) for citation and attribution.
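The four signs of trustworthy data listed above can be turned into a quick checklist. The following is a minimal Python sketch, not a definitive test of quality: the record and its field names (publisher, description, licence, doi) are hypothetical and will differ between repositories, and the DOI check only tests the general shape of a modern DOI, not whether it actually resolves.

```python
import re

# General shape of a modern DOI: "10.", a 4-9 digit registrant code, "/", a suffix.
# This checks syntax only; it does not confirm that the DOI resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def trust_checklist(record):
    """Report which of the four trust signals a metadata record shows.

    `record` is a hypothetical dict; the field names are assumptions
    made for this sketch.
    """
    doi = str(record.get("doi", "")).strip()
    return {
        # Presence of a host only; judging its reputation is left to you.
        "names_hosting_institution": bool(record.get("publisher")),
        "has_documentation": bool(record.get("description")),
        "has_licence": bool(record.get("licence")),
        "has_well_formed_doi": bool(DOI_PATTERN.match(doi)),
    }


# Made-up record for illustration; none of these values refer to real data.
example = {
    "publisher": "Example University Data Repository",
    "description": "Survey responses with an accompanying codebook.",
    "licence": "CC-BY-4.0",
    "doi": "10.1234/example.5678",
}

checks = trust_checklist(example)
missing = [name for name, ok in checks.items() if not ok]
print(missing or "all four signals present")
```

A dataset that fails one of these checks is not necessarily unreliable, but each missing item is something to follow up before you build on the data.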
You may find the following sources helpful as a starting point:
Some research funding bodies require data to be deposited in specific discipline-based repositories. Examples include:
It is highly unlikely that an open dataset exists which exactly matches your research requirements. In practice, open data generally serves two main functions, and both require additional input in order to be usable:
Trust is mainly based on the provenance of data, so the quality of the documentation or metadata which accompanies open data is of paramount importance. Initiatives such as FAIR data and the CoreTrustSeal certification of repositories, together with some publishers, are gradually building a framework in which open data can be trusted, verified and reused in academic research.