Bookmark Manager

Idea

  • Build a simple service to power the References section on articles
  • The frontend sends a link
  • The service makes a request to the link and returns the following:
    • Title
    • SEO image
    • Short description

Thought process

I'll be trying a different approach this time around. Rather than spending time doing a lot of research up front, I'll come up with a quick solution first, then research areas for improvement. Here's the quick solution:

  • Make a request to the specified endpoint
  • Check for 2xx status code
  • Parse HTML document
  • Return the parsed content to the client

Implementation

I need an endpoint that makes a request to the specified URL and returns the parsed content. To start, I need two functions:

  • FetchPageDetails(): An HTTP handler
  • ParseHTML(): An internal function that processes the result of the HTTP request

Project setup

Terminal window
mkdir bookmark-manager && cd bookmark-manager
go mod init github.com/odujokod/bookmark-manager
  • Create a main.go file in the root directory
  • Create bookmark_test.go and bookmark.go as well

Fetching the HTML

Using TDD, I'll start with a failing test:

bookmark_test.go
package bookmark

import (
    "fmt"
    "net/http"
    "net/http/httptest"
    "testing"
)

func TestFetchPage(t *testing.T) {
    externalURL := "https://google.com"
    path := "/fetch"
    url := fmt.Sprintf("%s?url=%s", path, externalURL)

    req, _ := http.NewRequest(http.MethodGet, url, nil)
    res := httptest.NewRecorder()

    FetchPageDetails(res, req)

    expected := 200
    got := res.Result().StatusCode
    if got != expected {
        t.Errorf("Expected: %d, got %d\n", expected, got)
    }
}
bookmark.go
package bookmark

import (
    "fmt"
    "io"
    "net/http"
)

func FetchPageDetails(w http.ResponseWriter, r *http.Request) {
    url := r.FormValue("url")
    if url == "" {
        http.Error(w, "URL is required", http.StatusBadRequest)
        return
    }

    res, err := http.Get(url)
    if err != nil {
        http.Error(w, "Unable to fetch page", http.StatusInternalServerError)
        return
    }
    defer res.Body.Close()

    if res.StatusCode > 299 {
        http.Error(w, res.Status, res.StatusCode)
        return
    }

    body, err := io.ReadAll(res.Body)
    if err != nil {
        http.Error(w, "Unable to read page content", http.StatusInternalServerError)
        return
    }

    fmt.Println(string(body))
}

Parsing the HTML

Write the test:

bookmark_test.go
func TestParseHTML(t *testing.T) {
    // I can actually read this from the sample.html file
    sampleHTML := `<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta name="description" content="Description goes here">
    <meta name="og:title" content="Go test">
    <meta name="og:description" content="Description goes here">
    <meta name="og:image" content="https://cdn1.iconfinder.com/data/icons/google-s-logo/150/Google_Icons-09-1024.png">
    <title>Go test</title>
</head>
<body>
    <div>
        Hello world
    </div>
</body>
</html>`

    htmlBytes := []byte(sampleHTML)
    got, err := ParseHTML(htmlBytes)
    if err != nil {
        t.Errorf("Unable to parse HTML: %v", err)
    }

    expectedTitle := "Go test"
    if got.Title != expectedTitle {
        t.Errorf("Expected: %s, got: %s", expectedTitle, got.Title)
    }
}

This gives an insight into the implementation of the feature. I'll need an HTML parser that lets me fetch the details needed on the frontend, which are:

  • Title
  • Description
  • ImageURL (optional)

I found GoQuery, which is built on top of the net/html package, to handle the HTML parsing:

Install GoQuery
go get github.com/PuerkitoBio/goquery

With GoQuery installed, I can work on the parsing logic:

bookmark.go
import (
    // other imports
    "bytes"
    "strings"

    "github.com/PuerkitoBio/goquery"
)

type Bookmark struct {
    Title       string `json:"title"`
    Description string `json:"description"`
    ImageURL    string `json:"imageURL"`
}

func ParseHTML(html []byte) (Bookmark, error) {
    doc, err := goquery.NewDocumentFromReader(bytes.NewBuffer(html))
    if err != nil {
        return Bookmark{}, err
    }

    bookmark := Bookmark{}
    bookmark.Title = strings.TrimSpace(doc.Find("title").Text())

    doc.Find("meta").Each(func(i int, s *goquery.Selection) {
        name, _ := s.Attr("name")
        value, _ := s.Attr("content")
        switch name {
        case "description", "og:description":
            bookmark.Description = value
        case "og:image":
            bookmark.ImageURL = value
        }
    })

    return bookmark, nil
}

Refactoring

With the parsing logic in place, I can now refactor the fetch test and finalise the function implementation:

Tying it all up

  • Create the main() function in the main.go file
  • Add the router
main.go
package main

import (
    "fmt"
    "log"
    "net/http"
)

const PORT string = ":8081"

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("GET /fetch", FetchPageDetails)

    fmt.Printf("Server running on port: %s\n", PORT)
    log.Fatal(http.ListenAndServe(PORT, mux))
}

Questions

  • How do I handle pages that have anti-bot protection?
  • How should I handle missing og:image?
  • Where should I deploy? Coolify? Or a general cloud provider?
  • How should persistence be handled? DB or cache? (not needed now)
  • For the frontend, should the bookmark be loaded at build time or runtime?