Migration from Squarespace to WordPress

Migrating from Squarespace to WordPress: a script to grab externally hosted images from the_content of each post, upload them to the media gallery, and set the featured image

We migrate sites from Squarespace to WordPress on a regular basis. Companies leave Squarespace looking for a flexible, cost-effective, and powerful CMS to build whatever they want. Squarespace is not as bad as Wix, but it has its pain points, which trigger massive migrations to self-hosted WordPress websites.

Migrating from Squarespace to WordPress: the painful process of setting featured images

When migrating from Squarespace to WordPress, moving hundreds of posts can be painful. The default export file lets you import the posts easily, which is good. But in my case, because the images hosted at Squarespace were part of the post body, they were not picked up as featured images on import. On top of that, the image URLs don’t have a file extension, so the script appends .png to make them uploadable.

For this specific use case, you can add the following PHP snippet to your theme’s functions.php file or to a plugin (the first option is the quickest). It loops through all the posts and checks whether each one has a featured image. If not, it looks for an image URL in a <noscript> tag within the content of that post. If it finds one, it uploads the image and sets it as the featured image for the post.
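To try the extraction step on its own before touching any posts, here is a minimal standalone sketch. The function name extract_noscript_image_url, the sample markup, and the example URL are hypothetical. The pattern is a bit more tolerant than the one in the script, since it allows whitespace and other attributes before src. That tolerance is an assumption about how your export may look, so adjust it to your own content.

```php
<?php
// Hypothetical helper: returns the first image URL found inside a <noscript> tag, or null.
function extract_noscript_image_url(string $content): ?string
{
    // Allow whitespace after <noscript> and other attributes before src.
    // Requiring whitespace right before "src" avoids matching attributes such as data-src.
    $pattern = '/<noscript>\s*<img\b[^>]*?\ssrc="([^"]+)"/i';

    if (preg_match($pattern, $content, $matches)) {
        return $matches[1];
    }

    return null;
}

// Made-up sample of Squarespace-style markup.
$sample = '<div class="image-block-outer-wrapper"><noscript><img src="https://images.example.com/content/abc123/photo" alt="" /></noscript></div>';

var_dump(extract_noscript_image_url($sample));
var_dump(extract_noscript_image_url('<p>No image here</p>'));
```

Running this against the content of a few exported posts first shows whether the pattern matches your markup.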

How to use the script

Copy and paste. Make a backup if you don’t want to lose your job.

After adding it to functions.php open the site and wait.

Turn on the logs!

Before anything else:

  • Check the amount of media items in the library
  • Make sure to turn on logging in the wp-config.php file in your WordPress root folder:
define( 'WP_DEBUG', true );
define( 'WP_DEBUG_LOG', true );
define( 'WP_DEBUG_DISPLAY', false );

Add the script to functions.php and… open your website!

The script for functions.php

function set_featured_image_for_all_posts() {
    // Query to get all posts
    $args = array(
        'post_type' => 'post',
        'post_status' => 'publish',
        'posts_per_page' => -1,
    );
    $query = new WP_Query($args);

    // Loop through each post
    if ($query->have_posts()) {
        while ($query->have_posts()) {
            $query->the_post();
            $post_id = get_the_ID();

            // Skip if the post already has a featured image
            if (has_post_thumbnail($post_id)) {
                error_log("Post $post_id already has a featured image. Skipping.");
                continue;
            }

            // Get the content of the post
            $post = get_post($post_id);
            $content = $post->post_content;

            // Use regex to find img src within noscript tags
            if (preg_match('/<noscript><img src="([^"]+)"/', $content, $matches)) {
                $image_url = $matches[1];
                error_log("Image URL found for post $post_id: $image_url");

                // Upload and attach image to post as a featured image
                require_once(ABSPATH . 'wp-admin/includes/image.php');
                require_once(ABSPATH . 'wp-admin/includes/file.php');
                require_once(ABSPATH . 'wp-admin/includes/media.php');

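                $tmp = download_url($image_url);

                // download_url() returns a WP_Error on failure, so stop here for this post
                if (is_wp_error($tmp)) {
                    error_log("Error downloading image for post $post_id: " . $tmp->get_error_message());
                    continue;
                }

                // Squarespace image URLs have no extension, so append .png manually.
                // Use only the URL path so a query string never ends up in the file name.
                $file_array = array(
                    'name' => basename((string) parse_url($image_url, PHP_URL_PATH)) . '.png',
                    'tmp_name' => $tmp,
                );

                $attach_id = media_handle_sideload($file_array, $post_id);

                if (!is_wp_error($attach_id)) {
                    set_post_thumbnail($post_id, $attach_id);
                    error_log("Featured image set for post $post_id.");
                } else {
                    // Remove the temporary file if the sideload failed
                    @unlink($tmp);
                    error_log("Error uploading image for post $post_id: " . $attach_id->get_error_message());
                }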
            } else {
                error_log("No matching image URL found in post $post_id.");
            }
        }
    }
    wp_reset_postdata();
}

// Hook the function to wp_loaded; while hooked it runs on every page load, so remove it once the logs show it has finished
add_action('wp_loaded', 'set_featured_image_for_all_posts');

Validations

  • Check the logs.
  • Check the amount of media items in the library.
  • Remove the code from functions.php.
  • Disable the logs.
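Checking the logs is easier with a quick tally. With WP_DEBUG_LOG set to true, WordPress writes to wp-content/debug.log by default. The sketch below counts the messages that the script writes with error_log(). The function name summarize_featured_image_log and the sample lines are hypothetical; the message fragments are the ones used in the script above.

```php
<?php
// Hypothetical helper: counts the script's log messages in the lines of a debug log.
function summarize_featured_image_log(array $lines): array
{
    // Keys are labels for the summary; values are fragments of the script's error_log() messages.
    $needles = array(
        'set'      => 'Featured image set for post',
        'skipped'  => 'already has a featured image',
        'no_image' => 'No matching image URL found in post',
        'errors'   => 'Error uploading image for post',
    );

    $summary = array_fill_keys(array_keys($needles), 0);

    foreach ($lines as $line) {
        foreach ($needles as $label => $needle) {
            if (strpos($line, $needle) !== false) {
                $summary[$label]++;
                break; // each log line matches at most one message type
            }
        }
    }

    return $summary;
}

// Made-up sample lines; in practice read the log with file($path, FILE_IGNORE_NEW_LINES).
$sample_lines = array(
    '[01-Jan-2024 10:00:00 UTC] Post 12 already has a featured image. Skipping.',
    '[01-Jan-2024 10:00:01 UTC] Image URL found for post 13: https://images.example.com/abc',
    '[01-Jan-2024 10:00:02 UTC] Featured image set for post 13.',
    '[01-Jan-2024 10:00:03 UTC] No matching image URL found in post 14.',
);

print_r(summarize_featured_image_log($sample_lines));
```

After a single run, the "set" count should roughly match the increase in media items you noted before running the script.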

 

Part 2: cleaning the horrid HTML leftovers from Squarespace

We have the featured image in our database! Awesome, but the_content of each post still contains dirty HTML, including a noscript tag and Squarespace URLs. This HTML also renders badly on the WordPress frontend, causing a weird, massive bottom padding. The markup was part of the content in Squarespace; now that it sits in WordPress, it just breaks our website.

The following script loops over all posts and uses a regex to check each one for the div with class image-block-outer-wrapper, which is the container of all the undesired markup left over from the Squarespace to WordPress migration.
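Before running the pattern against real posts, it helps to see what it removes. The sketch below applies the same regex used in the cleanup functions to two made-up strings; the sample markup and URLs are hypothetical. Because the pattern stops at the first closing div, a wrapper that contains another div is only partly removed, so it is worth checking how your own exported markup is nested.

```php
<?php
// The pattern used by the cleanup functions in this post.
$pattern = '/<div[^>]*class="\s*image-block-outer-wrapper[^>]*>.*?<\/div>/is';

// Made-up sample: a wrapper div with no nested divs.
$flat = '<p>Intro</p><div class="image-block-outer-wrapper layout-caption-below"><noscript><img src="https://images.example.com/abc" /></noscript></div><p>Outro</p>';

// Made-up sample: the wrapper contains another div.
$nested = '<p>Intro</p><div class="image-block-outer-wrapper"><div class="inner"><img src="https://images.example.com/abc" /></div></div><p>Outro</p>';

$flat_cleaned   = preg_replace($pattern, '', $flat);
$nested_cleaned = preg_replace($pattern, '', $nested);

echo $flat_cleaned, PHP_EOL;   // the whole wrapper is removed
echo $nested_cleaned, PHP_EOL; // a stray closing </div> is left behind
```

If your posts look like the second sample, the regex needs to be adapted to your markup before you run it over everything.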

Don’t forget to make a backup. This script can of course also be used to strip other undesired HTML from posts; use with caution. Place it in your functions.php file and open the site. Check the logs, check the posts. I have included two pieces of code: the first was my test on a single post, and then I decided to run it over all the posts. That being said, do not run both together!

Removing the undesired code from a single post first to test the script

// First I try to remove the block of undesired code from 1 post:

function remove_custom_html_block_from_single_post($post_id) {
    $post = get_post($post_id);
    
    if ($post) {
        $content = $post->post_content;

        $pattern = '/<div[^>]*class="\s*image-block-outer-wrapper[^>]*>.*?<\/div>/is';

        if (preg_match($pattern, $content)) {
            $cleaned_content = preg_replace($pattern, '', $content);

            wp_update_post(array(
                'ID' => $post_id,
                'post_content' => $cleaned_content,
            ));

            error_log("Post $post_id updated successfully");
        } else {
            error_log("No matching HTML block found in post $post_id");
        }
    } else {
        error_log("Failed to retrieve post $post_id");
    }
}

// Run the function on a single test post (replace 224 with the ID of the post you want to test)
$post_id = 224;
remove_custom_html_block_from_single_post($post_id);

// This function will instead go across all the existing posts.

function remove_custom_html_block_from_all_posts() {
    $args = array(
        'post_type' => 'post',
        'posts_per_page' => -1,
    );
    $query = new WP_Query($args);


    if ($query->have_posts()) {
        while ($query->have_posts()) {
            $query->the_post();
            $post_id = get_the_ID();
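            // Read the raw stored content; get_the_content() can return only part of a post split with <!--more--> or <!--nextpage-->
            $content = get_post($post_id)->post_content;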

            $pattern = '/<div[^>]*class="\s*image-block-outer-wrapper[^>]*>.*?<\/div>/is';

            if (preg_match($pattern, $content)) {
                $cleaned_content = preg_replace($pattern, '', $content);

                wp_update_post(array(
                    'ID' => $post_id,
                    'post_content' => $cleaned_content,
                ));
            }
        }
        wp_reset_postdata();
    }

}

// Run the function (you may want to trigger this function in a safer way, e.g., via a custom admin action)
remove_custom_html_block_from_all_posts();
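
As a rough sketch of the "safer way" mentioned in the comment above, the cleanup could run only when a logged-in administrator opens a dashboard URL with a specific query flag, instead of on every page load. The flag name run_squarespace_cleanup and the helper sqsp_cleanup_requested are hypothetical; admin_init and current_user_can() are standard WordPress. This would replace the direct call on the last line, and the opening <?php tag should be dropped when pasting into functions.php.

```php
<?php
// Hypothetical query flag name; not part of the original script.
const SQSP_CLEANUP_FLAG = 'run_squarespace_cleanup';

// Hypothetical helper: true only when an administrator explicitly asked for the cleanup.
function sqsp_cleanup_requested(array $query_args, bool $user_is_admin): bool
{
    return $user_is_admin
        && isset($query_args[SQSP_CLEANUP_FLAG])
        && $query_args[SQSP_CLEANUP_FLAG] === '1';
}

// Only register the hook when WordPress is loaded, so the file can also be run on its own.
if (function_exists('add_action')) {
    add_action('admin_init', function () {
        if (sqsp_cleanup_requested($_GET, current_user_can('manage_options'))) {
            remove_custom_html_block_from_all_posts();
        }
    });
}
```

With this in place, visiting /wp-admin/?run_squarespace_cleanup=1 as an administrator would trigger one run, and normal visitors never trigger it.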