I'm trying to crawl all the repository names listed on Docker Hub via this link: https://hub.docker.com/search/?q=*&page=1&isAutomated=0&isOfficial=1&pullCount=0&starCount=0
The HTML tag I'm interested in is:
<div class="RepositoryListItem__repoName___3iIWs" data-reactid=".s0zyncta0w.126.96.36.199.0.$4lexnz/overtime.0.0.1.0">4lexnz/overtime</div>
where the data-reactid is always different for each repository.
I'm using Bash and would like to extract the text inside every div that has class="RepositoryListItem__repoName___3iIWs". Can someone please help me construct a regexp and command chain to do that in Bash?
So far I have:
content=$(curl -L 'https://hub.docker.com/search/?q=*&page=1&isAutomated=0&isOfficial=0&pullCount=0&starCount=0')
echo $content | grep -oP '(?<=<div class="RepositoryListItem__repoName___3iIWs").*?(?= </div>)'
but this doesn't return anything at all. The value of $content is correct, so it's the final grep that isn't doing what I want. Can someone help, please? Thank you!
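
For reference, here is a minimal sketch of the kind of result I'm after, run against the sample tag above instead of the live page. It assumes GNU grep built with PCRE support (-P), and that each repository name is the only text inside its div. The variable names `sample` and `names` are just placeholders. The original pattern seems to expect the match to start immediately after the class attribute and to end at a space before </div>, and the sample has no such space. So this variant skips the remaining attributes with `[^>]*>`, drops everything matched so far with `\K`, and keeps the text up to the next `<`. I have not checked it against the full page.

```shell
# Sample tag copied from the question; it stands in for the fetched page.
sample='<div class="RepositoryListItem__repoName___3iIWs" data-reactid=".s0zyncta0w.126.96.36.199.0.$4lexnz/overtime.0.0.1.0">4lexnz/overtime</div>'

# [^>]*> skips the remaining attributes and the end of the opening tag,
# \K discards everything matched so far from the reported match,
# [^<]+ keeps the text up to the start of the closing </div>.
names=$(printf '%s\n' "$sample" |
  grep -oP 'class="RepositoryListItem__repoName___3iIWs"[^>]*>\K[^<]+')

printf '%s\n' "$names"
```

On the sample this should print 4lexnz/overtime. With the real page, "$sample" would be replaced by "$content", and grep -o should print one name per line.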