[SOLVED] Question about BeautifulSoup

alan.caedus · June 30, 2020, 12:17:19 AM

Hi everyone! I have been practicing web scraping with Python's BeautifulSoup library on the page whose URL appears in the code below.

My goal is to collect the links to all the job offers, but for some reason the script only gets the link of the first offer. I would appreciate any ideas or advice. Cheers!

Code: python


from requests import get
from bs4 import BeautifulSoup

url = 'http://www.python.org.ar/trabajo/'
links_ofertas = []
respuesta = get(url)
soup = BeautifulSoup(respuesta.text, 'html.parser')
ofertas = soup.findAll('div', class_ = 'content-wrapper')

for oferta in ofertas:
    oferta.find('div', class_ = 'col-md-12')
    link = oferta.h4.a
    if(link.has_attr('href')):
        links_ofertas.append(link['href'])
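
A likely explanation for the behaviour described above, assuming the page wraps every offer in a single `content-wrapper` div (the real markup is not shown in this thread): `oferta.h4` is shorthand for `oferta.find('h4')`, so it returns only the first `<h4>` inside each wrapper, and the result of `oferta.find('div', class_ = 'col-md-12')` is discarded. The sketch below reproduces this offline with made-up HTML; `SAMPLE_HTML` and its offer names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a single wrapper div that holds several offers.
SAMPLE_HTML = """
<div class="content-wrapper">
  <div class="col-md-12"><h4><a href="/trabajo/offer-a/">Offer A</a></h4></div>
  <div class="col-md-12"><h4><a href="/trabajo/offer-b/">Offer B</a></h4></div>
  <div class="col-md-12"><h4><a href="/trabajo/offer-c/">Offer C</a></h4></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
wrappers = soup.find_all('div', class_='content-wrapper')

# Attribute access (wrapper.h4) behaves like wrapper.find('h4'):
# it yields only the first <h4> under each wrapper.
first_only = [wrapper.h4.a['href'] for wrapper in wrappers]

# find_all('h4') visits every <h4> under each wrapper instead.
all_links = [
    h4.a['href']
    for wrapper in wrappers
    for h4 in wrapper.find_all('h4')
    if h4.a is not None and h4.a.has_attr('href')
]

print(first_only)
print(all_links)
```

With one wrapper on the page, the outer loop runs once, which would explain getting a single link.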

DtxdF · June 30, 2020, 01:20:31 AM

This might work:

Code: python

import re

import bs4
import requests

response = requests.get('http://www.python.org.ar/trabajo/')
soup = bs4.BeautifulSoup(response.text, 'html.parser')

# Offer links sit inside <h4> headings, so visit every <h4> on the page.
for h4 in soup.find_all('h4'):
    link = h4.a
    href = link.get('href') if link else None

    # Skip headings without a link; keep only hrefs that start with /trabajo/.
    if href and re.match(r'/trabajo/', href):
        print('URL:', href)

The output would look something like this:

Code: text

URL: /trabajo/devops-engineer-6/
URL: /trabajo/automation-engineer/
URL: /trabajo/senior-javascript-developer-2/
URL: /trabajo/lider-tecnico-desarrollador-backend-django-sql-aws/
URL: /trabajo/senior-python-backend/
URL: /trabajo/back-end-software-engineer/
URL: /trabajo/desarrollador-python-32/
URL: /trabajo/sr-python-dev-con-django/
URL: /trabajo/python-developer-team-leader/
URL: /trabajo/sr-dev-pythonreact-pref-arquitectura-y-nuevos-desa/
URL: /trabajo/python-dev-senior-remoto/
URL: /trabajo/100-remote-python-developer-us-client/
URL: /trabajo/ingeniero-de-requerimientos-sr/
URL: /trabajo/python-developer-53/
URL: /trabajo/ssr-qa-automation-engineer-python-80-remoto/
URL: /trabajo/desarrollador-python-31/
URL: /trabajo/senior-python-developer-100-remoto/
URL: /trabajo/data-engineer-3/
URL: /trabajo/data-analyst-con-tableau-power-bi-qlik-o-google-an/
URL: /trabajo/devops-bash-python-ruby/
URL: /trabajo/senior-python-developer-9/
URL: /trabajo/full-stack-dev-o-front-end-dev/
URL: /trabajo/buscamos-freelance-developer-para-finalizacion-y-m/

~ DtxdF
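
The `href` values in the output above are relative paths. If absolute links are needed, one option is `urllib.parse.urljoin` from the standard library. This is a minimal sketch that reuses the URL from the thread as the base; `BASE_URL`, `relative_links`, and `absolute_links` are names chosen here for illustration.

```python
from urllib.parse import urljoin

# Base URL taken from the code in this thread.
BASE_URL = 'http://www.python.org.ar/trabajo/'

# A few of the relative paths from the sample output.
relative_links = [
    '/trabajo/devops-engineer-6/',
    '/trabajo/automation-engineer/',
    '/trabajo/data-engineer-3/',
]

# urljoin resolves each path against the base, like a browser would.
absolute_links = [urljoin(BASE_URL, href) for href in relative_links]

for link in absolute_links:
    print(link)
```

Because the paths start with `/`, they replace the whole path of the base URL rather than being appended to it.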

DtxdF · June 30, 2020, 04:40:58 PM

@alan.caedus

I just edited the code. When I first tested it, I had downloaded the HTML with my browser and did the scraping locally, so the code I originally posted would not have printed anything. I also retyped the code directly in this reply box, so it could have raised the typical "TabError". My apologies, it is fixed now.

~ DtxdF
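
For context on the "TabError" mentioned above: Python 3 raises it when a block mixes tabs and spaces so that the indentation becomes ambiguous, which is easy to cause when retyping code in a web form. The sketch below triggers it on purpose; the snippet string and the `check_indentation` helper are made up for illustration.

```python
# Line 2 is indented with a tab, line 3 with eight spaces.
MIXED_INDENT_SOURCE = "if True:\n\tx = 1\n        y = 2\n"


def check_indentation(source):
    """Return the name of the error raised while compiling, or None."""
    try:
        compile(source, '<forum-snippet>', 'exec')
    except SyntaxError as exc:  # TabError is a subclass of SyntaxError
        return type(exc).__name__
    return None


result = check_indentation(MIXED_INDENT_SOURCE)
print(result)
```

Indenting with spaces only (or tabs only) avoids the problem.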

alan.caedus · July 01, 2020, 06:21:56 AM

Thanks a lot! Yes, I was just about to mention that, haha. Again, thanks for the help!

alan.caedus · June 30, 2020, 12:17:19 AM · Last edited: July 01, 2020, 07:24:26 AM by DtxdF
DtxdF · June 30, 2020, 01:20:31 AM · #1 · Last edited: June 30, 2020, 04:36:23 PM by DtxdF
DtxdF · June 30, 2020, 04:40:58 PM · #2
alan.caedus · July 01, 2020, 06:21:56 AM · #3